VR5500™ 64/32-Bit Microprocessor
µPD30550
Preliminary User's Manual

Document No. U16044EJ1V0UM00 (1st edition)
Date Published August 2002 N CP(K)
Printed in Japan
© 2001, 2002
NOTES FOR CMOS DEVICES

1 PRECAUTION AGAINST ESD FOR SEMICONDUCTORS

Note: A strong electric field, when exposed to a MOS device, can cause destruction of the gate oxide and ultimately degrade the device operation. Steps must be taken to stop generation of static electricity as much as possible, and to quickly dissipate it when it has occurred. Environmental control must be adequate. When it is dry, a humidifier should be used. It is recommended to avoid using insulators that easily build static electricity. Semiconductor devices must be stored and transported in an anti-static container, static shielding bag, or conductive material. All test and measurement tools, including the work bench and floor, should be grounded. The operator should be grounded using a wrist strap. Semiconductor devices must not be touched with bare hands. Similar precautions need to be taken for PW boards with semiconductor devices on them.

2 HANDLING OF UNUSED INPUT PINS FOR CMOS

Note: No connection for CMOS device inputs can be a cause of malfunction. If no connection is provided to the input pins, it is possible that an internal input level may be generated due to noise, etc., hence causing malfunction. CMOS devices behave differently than bipolar or NMOS devices. Input levels of CMOS devices must be fixed high or low by using pull-up or pull-down circuitry. Each unused pin should be connected to VDD or GND with a resistor, if it is considered to have a possibility of being an output pin. All handling related to the unused pins must be judged device by device and according to the related specifications governing the devices.

3 STATUS BEFORE INITIALIZATION OF MOS DEVICES

Note: Power-on does not necessarily define the initial status of a MOS device. The production process of MOS does not define the initial operation status of the device. Immediately after the power source is turned on, devices with a reset function have not yet been initialized. Hence, power-on does not guarantee out-pin levels, I/O settings, or contents of registers. A device is not initialized until the reset signal is received. The reset operation must be executed immediately after power-on for devices having a reset function.

VR Series, VR4000, VR4000 Series, VR4100 Series, VR4200, VR4300 Series, VR4400, VR5000, VR5000 Series, VR5000A, VR5432, VR5500, and VR10000 are trademarks of NEC Corporation.
MIPS is a registered trademark of MIPS Technologies, Inc. in the United States.
MC68000 is a trademark of Motorola Inc.
IBM370 is a trademark of IBM Corp.
Pentium is a trademark of Intel Corp.
DEC VAX is a trademark of Digital Equipment Corporation.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Ltd.
• The information contained in this document is being issued in advance of the production cycle for the device. The parameters for the device may change before final production, or NEC Corporation, at its own discretion, may withdraw the device prior to its production.
• Not all devices/types are available in every country. Please check with a local NEC representative for availability and additional information.
• No part of this document may be copied or reproduced in any form or by any means without the prior written consent of NEC Corporation. NEC Corporation assumes no responsibility for any errors which may appear in this document.
• NEC Corporation does not assume any liability for infringement of patents, copyrights, or other intellectual property rights of third parties by or arising from use of a device described herein or any other liability arising from use of such device. No license, either express, implied, or otherwise, is granted under any patents, copyrights, or other intellectual property rights of NEC Corporation or others.
• Descriptions of circuits, software, and other related information in this document are provided for illustrative purposes in semiconductor product operation and application examples. The incorporation of these circuits, software, and information in the design of the customer's equipment shall be done under the full responsibility of the customer. NEC Corporation assumes no responsibility for any losses incurred by the customer or third parties arising from the use of these circuits, software, and information.
• While NEC Corporation has been making continuous effort to enhance the reliability of its semiconductor devices, the possibility of defects cannot be eliminated entirely. To minimize risks of damage or injury to persons or property arising from a defect in an NEC semiconductor device, customers must incorporate sufficient safety measures in their design, such as redundancy, fire-containment, and anti-failure features.
• NEC devices are classified into the following three quality grades: "Standard", "Special", and "Specific". The Specific quality grade applies only to devices developed based on a customer-designated "quality assurance program" for a specific application. The recommended applications of a device depend on its quality grade, as indicated below. Customers must check the quality grade of each device before using it in a particular application.

  Standard: Computers, office equipment, communications equipment, test and measurement equipment, audio and visual equipment, home electronic appliances, machine tools, personal electronic equipment, and industrial robots
  Special: Transportation equipment (automobiles, trains, ships, etc.), traffic control systems, anti-disaster systems, anti-crime systems, safety equipment, and medical equipment (not specifically designed for life support)
  Specific: Aircraft, aerospace equipment, submersible repeaters, nuclear reactor control systems, life support systems or medical equipment for life support, etc.

The quality grade of NEC devices is "Standard" unless otherwise specified in NEC's data sheets or data books. If customers intend to use NEC devices for applications other than those specified for the Standard quality grade, they should contact an NEC sales representative in advance.

M5D 98.12

Exporting this product or equipment that includes this product may require a governmental license from the U.S.A. for some countries because this product utilizes technologies limited by the export control regulations of the U.S.A.
REGIONAL INFORMATION

Some information contained in this document may vary from country to country. Before using any NEC product in your application, please contact the NEC office in your country to obtain a list of authorized representatives and distributors. They will verify:

• Device availability
• Ordering information
• Product release schedule
• Availability of related technical literature
• Development environment specifications (for example, specifications for third-party tools and components, host computers, power plugs, AC supply voltages, and so forth)
• Network requirements

In addition, trademarks, registered trademarks, export restrictions, and other legal issues may also vary from country to country.

NEC Electronics Inc. (U.S.)
Santa Clara, California
Tel: 408-588-6000, 800-366-9782
Fax: 408-588-6130, 800-729-9288

NEC Electronics (Europe) GmbH
Duesseldorf, Germany
Tel: 0211-65 03 01
Fax: 0211-65 03 327

  • Sucursal en España
    Madrid, Spain
    Tel: 091-504 27 87
    Fax: 091-504 28 60

  • Succursale Française
    Vélizy-Villacoublay, France
    Tel: 01-30-67 58 00
    Fax: 01-30-67 58 99

  • Filiale Italiana
    Milano, Italy
    Tel: 02-66 75 41
    Fax: 02-66 75 42 99

  • Branch The Netherlands
    Eindhoven, The Netherlands
    Tel: 040-244 58 45
    Fax: 040-244 45 80

  • Branch Sweden
    Taeby, Sweden
    Tel: 08-63 80 820
    Fax: 08-63 80 388

  • United Kingdom Branch
    Milton Keynes, UK
    Tel: 01908-691-133
    Fax: 01908-670-290

NEC Electronics Hong Kong Ltd.
Hong Kong
Tel: 2886-9318
Fax: 2886-9022/9044

NEC Electronics Hong Kong Ltd.
Seoul Branch
Seoul, Korea
Tel: 02-528-0303
Fax: 02-528-4411

NEC Electronics Shanghai, Ltd.
Shanghai, P.R. China
Tel: 021-6841-1138
Fax: 021-6841-1137

NEC Electronics Taiwan Ltd.
Taipei, Taiwan
Tel: 02-2719-2377
Fax: 02-2719-5951

NEC Electronics Singapore Pte. Ltd.
Novena Square, Singapore
Tel: 253-8311
Fax: 250-3583

NEC do Brasil S.A.
Electron Devices Division
Guarulhos-SP, Brasil
Tel: 11-6462-6810
Fax: 11-6462-6829

J02.4
INTRODUCTION

Readers
  This manual is intended for users who wish to understand the functions of the VR5500 (µPD30550) and to develop application systems using this microprocessor.

Purpose
  This manual introduces the architecture and hardware functions of the VR5500 to users, following the organization described below.

Organization
  This manual consists of the following contents.
  • Introduction
  • Pipeline operation
  • Cache organization and memory management system
  • Exception processing
  • Floating-point unit operation
  • Hardware
  • Instruction set details

How to read this manual
  It is assumed that the reader of this manual has general knowledge in the fields of electrical engineering, logic circuits, and microcontrollers.
  The VR4400™ in this manual includes the VR4000™.
  The VR4000 Series™ in this document indicates the VR4100 Series™, VR4200™, VR4300 Series™, and VR4400.

  To learn in detail about the function of a specific instruction:
    Read CHAPTER 3 OUTLINE OF INSTRUCTION SET, CHAPTER 7 FLOATING-POINT UNIT, CHAPTER 17 CPU INSTRUCTION SET, and CHAPTER 18 FPU INSTRUCTION SET.
  To know about the overall functions of the VR5500:
    Read this manual in the order of the contents.
  To know about the electrical specifications of the VR5500:
    Refer to the Data Sheet, which is separately available.

Conventions
  Data significance:           Higher digits on the left and lower digits on the right
  Active low representation:   XXX# (trailing # after pin and signal names)
  Note:                        Footnote for item marked with Note in the text
  Caution:                     Information requiring particular attention
  Remark:                      Supplementary information
  Numerical representation:    Binary ... XXXX or XXXX₂
                               Decimal ... XXXX
                               Hexadecimal ... 0xXXXX
  Prefix indicating the power of 2 (address space, memory capacity):
                               K (kilo)  2^10 = 1,024
                               M (mega)  2^20 = 1,024^2
                               G (giga)  2^30 = 1,024^3
                               T (tera)  2^40 = 1,024^4
                               P (peta)  2^50 = 1,024^5
                               E (exa)   2^60 = 1,024^6
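As a quick check of the power-of-2 prefixes and the trailing-# active-low convention listed above, the following Python sketch may help. It is only an illustration under stated assumptions: the names PREFIX_EXPONENT, to_units, and is_active_low are hypothetical and do not come from this manual, and XXX# is the manual's own placeholder, not a real pin name.

```python
# Illustrative sketch only; the names below are hypothetical, not from the manual.

# Exponents of 2 for the prefixes K, M, G, T, P, E as listed in the conventions.
PREFIX_EXPONENT = {"K": 10, "M": 20, "G": 30, "T": 40, "P": 50, "E": 60}


def to_units(value: int, prefix: str) -> int:
    """Scale value by the power of 2 that the prefix stands for."""
    return value << PREFIX_EXPONENT[prefix]


def is_active_low(signal_name: str) -> bool:
    """Return True if the name carries the trailing # that marks active low."""
    return signal_name.endswith("#")


if __name__ == "__main__":
    # Each prefix is the next power of 1,024.
    for power, prefix in enumerate(PREFIX_EXPONENT, start=1):
        assert to_units(1, prefix) == 1024 ** power
    print(to_units(64, "K"))       # 64 K expressed as a plain count
    print(is_active_low("XXX#"))   # placeholder name from the conventions
```

Using a shift rather than a multiplication keeps the link to the 2^n definition explicit; both give the same result for non-negative integers.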
RELATED DOCUMENTS

The related documents indicated in this publication may include preliminary versions. However, preliminary versions are not marked as such.

Documents related to devices

  Document Name                                     Document No.
  µPD30550 (VR5500) Data Sheet                      To be prepared
  VR5500 User's Manual                              This manual
  VR5432™ User's Manual Volume 1                    U13751E
  VR5432 User's Manual Volume 2                     U15397E
  VR5000™, VR5000A™ User's Manual                   U11761E
  VR5000, VR10000™ Instruction User's Manual        U12754E

Application note

  Document Name                                     Document No.
  VR Series™ Programming Guide Application Note     U10710E
preliminary user?s manual u16044ej1v0um 8 contents chapter 1 general............................................................................................................. .................25 1.1 features .................................................................................................................... ...................25 1.2 ordering information ........................................................................................................ ..........26 1.3 v r 5500 processor................................................................................................................. .......26 1.3.1 internal block configuration .............................................................................................. ................ 28 1.3.2 cpu registers............................................................................................................. ...................... 30 1.3.3 coprocessors .............................................................................................................. ..................... 31 1.3.4 system control coprocessors (cp0)......................................................................................... ........ 32 1.3.5 floating-point unit ....................................................................................................... ..................... 33 1.3.6 cache memory.............................................................................................................. ................... 33 1.4 outline of instruction set .................................................................................................. .........34 1.5 data format and addressing .................................................................................................. ...35 1.6 memory management system.................................................................................................... 
38 1.6.1 high-speed translation lookaside buffer (tlb)............................................................................. .... 38 1.6.2 processor modes ........................................................................................................... .................. 38 1.7 instruction pipeline ........................................................................................................ .............38 1.7.1 branch prediction ......................................................................................................... .................... 38 chapter 2 pin functions ...................................................................................................... ............39 2.1 pin configuration ........................................................................................................... .............39 2.2 pin functions............................................................................................................... ................43 2.2.1 system interface signals .................................................................................................. ................ 43 2.2.2 initialization interface signals .......................................................................................... ................. 44 2.2.3 interrupt interface signals............................................................................................... .................. 46 2.2.4 clock interface signals ................................................................................................... .................. 46 2.2.5 power supply.............................................................................................................. ...................... 46 2.2.6 test interface signal..................................................................................................... .................... 
47 2.3 handling of unused pins..................................................................................................... .......48 2.3.1 system interface pin...................................................................................................... ................... 48 2.3.2 test interface pins....................................................................................................... ..................... 49 chapter 3 outline of instruction set.......................................................................................50 3.1 instruction set architecture ................................................................................................ .......50 3.1.1 instruction format ........................................................................................................ ..................... 51 3.1.2 load/store instructions................................................................................................... .................. 52 3.1.3 operation instructions .................................................................................................... .................. 55 3.1.4 jump/branch instructions .................................................................................................. ............... 55 3.1.5 special instructions ...................................................................................................... .................... 56 3.1.6 coprocessor instructions .................................................................................................. ............... 56 3.2 addition and modification of v r 5500 instructions...................................................................57 3.2.1 integer rotate instructions ............................................................................................... ................. 
57 3.2.2 sum-of-products instructions .............................................................................................. ............. 58 3.2.3 register scan instructions................................................................................................ ................ 59 3.2.4 floating-point load/store instructions .................................................................................... ........... 59 3.2.5 other additional instructions ............................................................................................. ............... 59
preliminary user?s manual u16044ej1v0um 9 3.2.6 instructions for which functions and operations were changed ....................................................... 60 3.3 outline of cpu instruction set .............................................................................................. .... 60 3.3.1 load and store instructions ............................................................................................... .............. 60 3.3.2 computational instructions ................................................................................................ .............. 63 3.3.3 jump and branch instructions.............................................................................................. ............ 72 3.3.4 special instructions...................................................................................................... .................... 75 3.3.5 coprocessor instructions .................................................................................................. ............... 77 3.3.6 system control coprocessor (cp0) instructions............................................................................. .. 78 chapter 4 pipeline ........................................................................................................... .................. 80 4.1 overview .................................................................................................................... .................. 80 4.1.1 pipeline stages ........................................................................................................... ..................... 81 4.1.2 configuration of pipeline................................................................................................. ................. 82 4.2 branch delay................................................................................................................ ............... 
85 4.3 load delay.................................................................................................................. ................. 86 4.3.1 non-blocking load......................................................................................................... ................... 86 4.4 exception processing ........................................................................................................ ........ 87 4.5 store buffer ................................................................................................................ ................. 87 4.6 write transaction buffer.................................................................................................... ........ 87 chapter 5 memory management system .................................................................................. 88 5.1 processor modes............................................................................................................. ........... 88 5.1.1 operating modes ........................................................................................................... .................. 88 5.1.2 instruction set modes ..................................................................................................... ................. 89 5.1.3 addressing modes .......................................................................................................... ................. 89 5.2 translation lookaside buffer (tlb) ......................................................................................... 9 0 5.2.1 format of tlb entry....................................................................................................... .................. 91 5.2.2 tlb instructions.......................................................................................................... ..................... 
92 5.2.3 tlb exception............................................................................................................. ..................... 92 5.3 virtual-to-physical address translation .................................................................................. 93 5.3.1 32-bit addressing mode address translation................................................................................ .... 96 5.3.2 64-bit addressing mode address translation................................................................................ .... 97 5.4 virtual address space....................................................................................................... ......... 98 5.4.1 user mode virtual address space ........................................................................................... ......... 99 5.4.2 supervisor mode virtual address space ..................................................................................... ... 101 5.4.3 kernel mode virtual address space ......................................................................................... ...... 104 5.5 memory management registers.............................................................................................. 111 5.5.1 index register (0) ........................................................................................................ ................... 112 5.5.2 random register (1)....................................................................................................... ................ 112 5.5.3 entrylo0 (2) and entrylo1 (3) registers ................................................................................... ..... 113 5.5.4 pagemask register (5) ..................................................................................................... .............. 115 5.5.5 wired register (6) ........................................................................................................ ................... 
116 5.5.6 entryhi register (10)..................................................................................................... .................. 117 5.5.7 prid (processor revision id) register (15) ................................................................................ ..... 118 5.5.8 config register (16) ...................................................................................................... .................. 118 5.5.9 lladdr (load linked address) register (17) ................................................................................ .... 121 5.5.10 taglo (28) and taghi (29) registers ...................................................................................... ....... 122
preliminary user?s manual u16044ej1v0um 10 chapter 6 exception processing ............................................................................................1 23 6.1 exception processing operation.............................................................................................1 23 6.2 exception processing registers .............................................................................................1 24 6.2.1 context register (4) ...................................................................................................... .................. 125 6.2.2 badvaddr register (8) ..................................................................................................... ............... 126 6.2.3 count register (9) ........................................................................................................ ................... 127 6.2.4 compare register (11) ..................................................................................................... ............... 127 6.2.5 status register (12)...................................................................................................... ................... 128 6.2.6 cause register (13) ....................................................................................................... ................. 131 6.2.7 epc (exception program counter) register (14) ............................................................................. 133 6.2.8 watchlo (18) and watchhi (19) registers................................................................................... ... 134 6.2.9 xcontext register (20) .................................................................................................... ................ 135 6.2.10 performance counter register (25) ........................................................................................ ........ 
136 6.2.11 parity error register (26)............................................................................................... .................. 138 6.2.12 cache error register (27) ................................................................................................ ............... 139 6.2.13 errorepc register (30) ................................................................................................... ................ 140 6.3 details of exceptions ....................................................................................................... .........141 6.3.1 exception types........................................................................................................... ................... 141 6.3.2 exception vector address.................................................................................................. ............. 143 6.3.3 priority of exceptions.................................................................................................... .................. 146 6.4 details of exceptions ....................................................................................................... .........147 6.4.1 reset exception ........................................................................................................... .................. 147 6.4.2 soft reset exception ...................................................................................................... ................. 148 6.4.3 nmi exception ............................................................................................................. ................... 149 6.4.4 address error exception ................................................................................................... .............. 150 6.4.5 tlb exceptions ............................................................................................................ .................. 
152 6.4.6 cache error exception..................................................................................................... ............... 155 6.4.7 bus error exception ....................................................................................................... ................. 156 6.4.8 system call exception ..................................................................................................... ............... 157 6.4.9 breakpoint exception ...................................................................................................... ............... 157 6.4.10 coprocessor unusable exception........................................................................................... ........ 158 6.4.11 reserved instruction exception........................................................................................... ........... 159 6.4.12 trap exception ........................................................................................................... .................... 159 6.4.13 integer overflow exception ............................................................................................... .............. 160 6.4.14 floating-point operation exception....................................................................................... .......... 160 6.4.15 watch exception .......................................................................................................... .................. 161 6.4.16 interrupt exception ...................................................................................................... ................... 162 6.5 exception processing flowcharts...........................................................................................16 3 chapter 7 floating-point unit ................................................................................................ 
....170 7.1 overview .................................................................................................................... ................170 7.2 fpu registers............................................................................................................... .............170 7.2.1 floating-point general-purpose registers (fgrs)........................................................................... 171 7.2.2 floating-point registers (fprs) ........................................................................................... ........... 172 7.2.3 floating-point control registers (fcrs) ................................................................................... ....... 172 7.3 floating-point control register............................................................................................. ..173 7.3.1 control/status register (fcr31)........................................................................................... .......... 173 7.3.2 enable/mode register (fcr28) .............................................................................................. ........ 176
  7.3.3 cause/flag register (fcr26) ... 176
  7.3.4 condition code register (fcr25) ... 176
  7.3.5 implementation/revision register (fcr0) ... 177
7.4 data format ... 178
  7.4.1 floating-point format ... 178
  7.4.2 fixed-point format ... 180
7.5 outline of fpu instruction set ... 181
  7.5.1 floating-point load/store/transfer instructions ... 182
  7.5.2 conversion instructions ... 185
  7.5.3 operation instructions ... 187
  7.5.4 comparison instruction ... 189
  7.5.5 fpu branch instructions ... 190
  7.5.6 other instructions ... 190
7.6 execution time of fpu instruction ... 191
chapter 8 floating-point exceptions ... 193
8.1 types of exceptions ... 193
8.2 exception processing ... 194
  8.2.1 flag ... 194
8.3 details of exceptions ... 196
  8.3.1 inexact operation exception (i) ... 196
  8.3.2 invalid operation exception (v) ... 197
  8.3.3 division-by-zero exception (z) ... 197
  8.3.4 overflow exception (o) ... 198
  8.3.5 underflow exception (u) ... 198
  8.3.6 unimplemented operation exception (e) ... 199
8.4 saving and restoring status ... 200
8.5 handler for ieee754 exceptions ... 200
chapter 9 initialization interface ... 201
9.1 functional outline ... 201
9.2 reset sequence ... 202
  9.2.1 power-on reset ... 202
  9.2.2 cold reset ... 203
  9.2.3 warm reset ... 204
  9.2.4 processor status at reset ... 204
9.3 initialization signals ... 205
chapter 10 clock interface ... 206
10.1 term definitions ... 206
10.2 basic system clock ... 207
  10.2.1 synchronization with sysclock ... 208
10.3 phase lock loop (pll) ... 208
chapter 11 cache memory ... 209
11.1 memory organization ... 209
  11.1.1 internal cache ... 210
11.2 configuration of cache ... 211
  11.2.1 configuration of instruction cache ... 211
  11.2.2 configuration of data cache ... 212
  11.2.3 location of data cache ... 212
11.3 cache operations ... 213
  11.3.1 coherency of cache data ... 213
  11.3.2 replacing instruction cache line ... 214
  11.3.3 replacing data cache line ... 214
  11.3.4 speculative replacement of data cache line ... 215
  11.3.5 accessing cache ... 216
11.4 status of cache ... 217
11.5 manipulating cache by external agent ... 217
chapter 12 overview of system interface ... 218
12.1 definition of terms ... 218
12.2 bus modes ... 219
12.3 outline of system interface ... 220
  12.3.1 interface bus ... 220
  12.3.2 address cycle and data cycle ... 221
  12.3.3 issuance cycle ... 221
  12.3.4 handshake signal ... 222
  12.3.5 system interface bus data ... 223
12.4 system interface protocol ... 224
  12.4.1 master status and slave status ... 224
  12.4.2 external arbitration ... 225
  12.4.3 uncompelled transition to slave status ... 225
  12.4.4 processor requests and external requests ... 226
12.5 processor requests ... 227
  12.5.1 processor read request ... 228
  12.5.2 processor write request ... 228
12.6 external requests ... 229
  12.6.1 external write request ... 230
  12.6.2 read response ... 230
12.7 event processing ... 231
  12.7.1 load miss ... 231
  12.7.2 store miss ... 232
  12.7.3 store hit ... 232
  12.7.4 load/store in uncached area ... 232
  12.7.5 accelerated store in uncached area ... 232
  12.7.6 instruction fetch from uncached area ... 233
  12.7.7 fetch miss ... 233
12.8 error check function ... 234
  12.8.1 parity error check ... 234
  12.8.2 error check operation ... 235
chapter 13 system interface (64-bit bus mode) ... 237
13.1 protocol of processor requests ... 238
  13.1.1 processor read request protocol ... 238
  13.1.2 processor write request protocol ... 239
  13.1.3 control of processor request flow ... 241
  13.1.4 timing mode of processor request ... 242
13.2 protocol of external request ... 246
  13.2.1 external arbitration protocol ... 246
  13.2.2 external null request protocol ... 248
  13.2.3 external write request protocol ... 249
  13.2.4 read response protocol ... 250
  13.2.5 sysadc(7:0) protocol for block read response ... 252
13.3 data flow control ... 252
  13.3.1 data rate control ... 252
  13.3.2 block write data transfer pattern ... 253
  13.3.3 system endianness ... 253
13.4 independent transfer with sysad bus ... 254
13.5 system interface cycle time ... 254
13.6 system interface commands and data identifiers ... 255
  13.6.1 syntax of commands and data identifiers ... 255
  13.6.2 syntax of command ... 255
  13.6.3 syntax of data identifier ... 258
13.7 system interface address ... 260
  13.7.1 address specification rules ... 260
  13.7.2 sub-block ordering ... 260
  13.7.3 processor internal address map ... 260
chapter 14 system interface (32-bit bus mode) ... 261
14.1 protocol of processor requests ... 262
  14.1.1 processor read request protocol ... 262
  14.1.2 processor write request protocol ... 263
  14.1.3 control of processor request flow ... 265
  14.1.4 timing mode of processor request ... 267
14.2 protocol of external request ... 271
  14.2.1 external arbitration protocol ... 271
  14.2.2 external null request protocol ... 273
  14.2.3 external write request protocol ... 274
  14.2.4 read response protocol ... 275
  14.2.5 sysadc(3:0) protocol for block read response ... 277
14.3 data flow control ... 277
  14.3.1 data rate control ... 277
  14.3.2 block write data transfer pattern ... 278
  14.3.3 word transfer sequence ... 279
  14.3.4 system endianness ... 281
14.4 independent transfer with sysad bus ... 282
14.5 system interface cycle time ... 282
14.6 system interface commands and data identifiers ... 283
  14.6.1 syntax of commands and data identifiers ... 283
  14.6.2 syntax of command ... 283
  14.6.3 syntax of data identifier ... 286
14.7 system interface address ... 288
  14.7.1 address specification rules ... 288
  14.7.2 sub-block ordering ... 288
  14.7.3 processor internal address map ... 288
chapter 15 system interface (out-of-order return mode) ... 289
15.1 overview ... 290
  15.1.1 timing mode ... 290
  15.1.2 master status and slave status ... 291
  15.1.3 identifying request ... 291
15.2 protocol of out-of-order return mode ... 292
  15.2.1 successive read requests ... 293
  15.2.2 successive write requests ... 296
  15.2.3 write request following read request ... 298
  15.2.4 bus arbitration of processor ... 299
  15.2.5 single read request following block read request ... 302
  15.2.6 unaligned 2-word read request ... 305
15.3 system interface commands and data identifiers ... 306
  15.3.1 syntax of commands and data identifiers ... 306
  15.3.2 syntax of command ... 306
  15.3.3 syntax of data identifier ... 311
15.4 request identifier ... 313
chapter 16 interrupts ... 314
16.1 interrupt request type ... 314
  16.1.1 non-maskable interrupt (nmi) ... 314
  16.1.2 external ordinary interrupt ... 315
  16.1.3 software interrupts ... 315
  16.1.4 timer interrupt ... 315
16.2 acknowledging interrupt request signal ... 315
  16.2.1 detecting hardware interrupt ... 317
  16.2.2 masking interrupt signal ... 318
chapter 17 cpu instruction set ... 319
17.1 instruction notation conventions ... 319
17.2 cautions on using cpu instructions ... 321
  17.2.1 load and store instructions ... 321
  17.2.2 jump and branch instructions ... 322
  17.2.3 coprocessor instructions ... 322
  17.2.4 system control coprocessor (cp0) instructions ... 323
17.3 cpu instruction ... 323
17.4 cpu instruction opcode bit encoding ... 523
chapter 18 fpu instruction set ... 526
18.1 type of instruction ... 526
  18.1.1 data format ... 529
18.2 instruction notation conventions ... 530
18.3 cautions on using fpu instructions ... 532
  18.3.1 load and store instructions ... 532
  18.3.2 floating-point operation instructions ... 533
  18.3.3 fpu branch instruction ... 534
18.4 fpu instruction ... 534
18.5 fpu instruction opcode bit encoding ... 613
chapter 19 instruction hazards ... 615
19.1 overview ... 615
19.2 details of instruction hazard ... 615
chapter 20 pll passive elements ... 616
chapter 21 debugging and testing ... 617
21.1 overview ... 617
21.2 test interface signals ... 619
21.3 boundary scan ... 621
21.4 connecting debugging tool ... 623
  21.4.1 connecting in-circuit emulator and target board ... 623
  21.4.2 connection circuit example ... 625
appendix a sub-block order ... 626
appendix b recommended power supply circuit ... 629
appendix c restrictions on vr5500 ... 630
c.1 restrictions on ver. 1.x ... 630
  c.1.1 during normal operation ... 630
  c.1.2 when debug function is used ... 631
c.2 restrictions on ver. 2.0 ... 633
  c.2.1 during normal operation ... 633
  c.2.2 when using debug function ... 634
c.3 restrictions on ver. 2.1 or later ... 635
  c.3.1 during normal operation ... 635
  c.3.2 when using debug function ... 635
list of figures (1/5)
figure no. title page
1-1 internal block diagram ... 27
1-2 cpu registers ... 31
1-3 fpu registers ... 33
1-4 instruction type ... 34
1-5 byte address of big endian ... 35
1-6 byte address of little endian ... 36
1-7 byte address (unaligned word) ... 37
3-1 expansion of mips architecture ... 50
3-2 instruction format ... 51
3-3 byte specification related to load and store instructions ... 54
4-1 pipeline stages of vr5500 and instruction flow ... 81
4-2 combination of instructions that can be packed ... 83
4-3 instruction flow in execution pipeline ... 84
4-4 branch delay ... 85
4-5 load delay ... 86
4-6 exception detection ... 87
5-1 format of tlb entry ... 91
5-2 outline of tlb manipulation ... 92
5-3 virtual-to-physical address translation ... 94
5-4 tlb address translation ... 95
5-5 virtual address translation in 32-bit addressing mode ... 96
5-6 virtual address translation in 64-bit addressing mode ... 97
5-7 user mode address space ... 100
5-8 supervisor mode address space ... 102
5-9 kernel mode address space ... 105
5-10 xkphys area address space ... 106
5-11 index register ... 112
5-12 random register ... 112
5-13 entrylo0 and entrylo1 registers ... 113
5-14 pagemask register ... 115
5-15 positions indicated by wired register ... 116
5-16 wired register ... 116
5-17 entryhi register ... 117
5-18 prid register ... 118
5-19 config register ... 119
5-20 lladdr register ... 121
5-21 taglo and taghi registers ... 122
6-1 context register ... 125
6-2 badvaddr register ... 126
6-3 count register
6-4 compare register format
6-5 status register
6-6 status register diagnostic status field
6-7 cause register
6-8 epc register
6-9 watchlo and watchhi registers
6-10 xcontext register
6-11 performance counter register
6-12 parity error register
6-13 cache error register
6-14 errorepc register
6-15 general exception processing
6-16 tlb/xtlb refill exception processing
6-17 processing of cache error exception
6-18 processing of reset/soft reset/nmi exceptions
7-1 registers of fpu
7-2 fcr31
7-3 cause/enable/flag bits of fcr31
7-4 fcr28
7-5 fcr26
7-6 fcr25
7-7 fcr0
7-8 single-precision floating-point format
7-9 double-precision floating-point format
7-10 32-bit fixed-point format
7-11 64-bit fixed-point format
8-1 cause/enable/flag bits of fcr31
9-1 power-on reset timing
9-2 cold reset timing
9-3 warm reset timing
10-1 signal's transition points
10-2 clock-q delay
10-3 when frequency ratio of sysclock to pclock is 1:2
11-1 logical hierarchy of memory
11-2 internal cache and main memory
11-3 format of instruction cache line
11-4 line format of data cache
11-5 index and data output of cache
12-1 bus modes of vr5500
12-2 system interface bus (64-bit bus mode)
12-3 system interface bus (32-bit bus mode)
12-4 status of rdrdy#/wrrdy# signal of processor request
12-5 operation of system interface between registers
12-6 requests and system events
12-7 flow of processor requests
12-8 flow of external request
12-9 read response
13-1 processor read request
13-2 processor non-block write request protocol
13-3 processor block write request
13-4 control of processor request flow
13-5 timing when second processor write request is delayed
13-6 timing of vr4000-compatible back-to-back write cycle
13-7 write re-issuance
13-8 pipeline write
13-9 external request arbitration protocol
13-10 external null request protocol
13-11 external write request protocol
13-12 protocol of read request and read response
13-13 block read response in slave status
13-14 read response with data rate pattern ddx
13-15 bit definition of system interface command
13-16 bit definition of syscmd bus during read request
13-17 bit definition of syscmd bus during write request
13-18 bit definition of syscmd bus during null request
13-19 bit definition of system interface data identifier
14-1 processor read request
14-2 processor non-block write request protocol
14-3 processor block write request
14-4 control of processor request flow
14-5 timing when second processor write request is delayed
14-6 timing of vr4000-compatible back-to-back write cycle
14-7 write re-issuance
14-8 pipeline write
14-9 external request arbitration protocol
14-10 external null request protocol
14-11 external write request protocol
14-12 protocol of read request and read response
14-13 block read response in slave status
14-14 read response with data rate pattern ddx
14-15 bit definition of system interface command
14-16 bit definition of syscmd bus during read request
14-17 bit definition of syscmd bus during write request
14-18 bit definition of syscmd bus during null request
14-19 bit definition of system interface data identifier
15-1 successive read requests (in pipeline mode, with subsequent request)
15-2 successive read requests (in pipeline mode, without subsequent request)
15-3 successive read requests (in re-issuance mode)
15-4 successive write requests (in pipeline mode)
15-5 successive write requests (in re-issuance mode)
15-6 write request following read request
15-7 bus arbitration of processor (in pipeline mode, with subsequent request)
15-8 bus arbitration of processor (in pipeline mode, without subsequent request)
15-9 bus arbitration of processor (in re-issuance mode)
15-10 single read request following block read request (in pipeline mode, with subsequent request)
15-11 single read request following block read request (in pipeline mode, without subsequent request)
15-12 single read request following block read request (in re-issuance mode)
15-13 unaligned 2-word read (in pipeline mode, with subsequent request)
15-14 bit definition of system interface command
15-15 bit definition of syscmd bus during read request
15-16 bit definition of syscmd bus during write request
15-17 bit definition of syscmd bus during null request
15-18 bit definition of system interface data identifier
16-1 nmi# signal
16-2 bits of interrupt register and enable bits
16-3 hardware interrupt request signal
16-4 masking interrupt signal
17-1 cpu instruction opcode bit encoding
18-1 load/store instruction format
18-2 operation instruction format
18-3 fpu instruction opcode bit encoding
20-1 example of connection of pll passive elements
21-1 access to processor resources in debug mode
21-2 boundary scan register
21-3 ie connection connector pin layout
21-4 debugging tool connection circuit example (when trace function is used)
a-1 extracting data blocks in sequential order
a-2 extracting data in sub-block order
b-1 example of recommended power supply circuit connection
list of tables
table no. title
1-1 cp0 registers
2-1 system interface signals
2-2 initialization interface signals
2-3 interrupt interface signals
2-4 clock interface signals
2-5 power supply
2-6 test interface signals
3-1 load/store instructions using register + offset addressing mode
3-2 load/store instructions using register + register addressing mode
3-3 definition and usage of coprocessors by mips architecture
3-4 rotate instructions
3-5 macc instructions
3-6 sum-of-products instructions
3-7 register scan instructions
3-8 floating-point load/store instructions
3-9 coprocessor 0 instructions
3-10 special instructions
3-11 instruction function changes in vr5500
3-12 load/store instructions
3-13 load/store instructions (extended isa)
3-14 alu immediate instructions
3-15 alu immediate instructions (extended isa)
3-16 three-operand type instructions
3-17 three-operand type instructions (extended isa)
3-18 shift instructions
3-19 shift instructions (extended isa)
3-20 rotate instructions (for vr5500)
3-21 multiply/divide instructions
3-22 multiply/divide instructions (extended isa)
3-23 macc instructions (for vr5500)
3-24 sum-of-products instructions (for vr5500)
3-25 number of cycles for multiply and divide instructions
3-26 register scan instructions (for vr5500)
3-27 jump instruction
3-28 branch instructions
3-29 branch instructions (extended isa)
3-30 special instructions
3-31 special instructions (extended isa)
3-32 special instructions (for vr5500)
3-33 coprocessor instructions
3-34 coprocessor instructions (extended isa)
3-35 system control coprocessor (cp0) instructions
3-36 system control coprocessor (cp0) instructions (for vr5500)
5-1 operating modes
5-2 instruction set modes
5-3 addressing modes
5-4 32-bit and 64-bit user mode segments
5-5 32-bit and 64-bit supervisor mode segments
5-6 32-bit kernel mode segments
5-7 64-bit kernel mode segments
5-8 cache algorithm and xkphys address space
5-9 cp0 memory management registers
5-10 cache algorithm
5-11 mask values and page sizes
6-1 cp0 exception processing registers
6-2 exception codes
6-3 events to count
6-4 32-bit mode exception vector addresses
6-5 64-bit mode exception vector addresses
6-6 tlb refill exception vector
6-7 exception priority order
7-1 fcr
7-2 flush value of denormalized number result
7-3 rounding mode control bits
7-4 calculation expression of floating-point value
7-5 floating-point format and parameter value
7-6 maximum and minimum values of floating point
7-7 load/store/transfer instructions
7-8 conversion instructions
7-9 operation instructions
7-10 comparison instruction
7-11 conditions for comparison instruction
7-12 fpu branch instructions
7-13 prefetch instruction
7-14 conditional transfer instructions
7-15 number of execution cycles of floating-point instructions
8-1 default values of ieee754 exceptions in fpu
8-2 fpu internal result and flag status
12-1 system interface bus data
12-2 operation in case of load miss
12-3 operation in case of store miss
12-4 error check for internal transaction
12-5 error check for external transaction
13-1 transfer data rate and data pattern
13-2 code of system interface command syscmd(7:5)
13-3 code of syscmd(4:3) during read request
13-4 code of syscmd(2:0) during block read request
13-5 code of syscmd(2:0) during single read request
13-6 code of syscmd(4:3) during write request
13-7 code of syscmd(2:0) during block write request
13-8 code of syscmd(2:0) during single write request
13-9 code of syscmd(4:3) during null request
13-10 codes of syscmd(7:5) of processor data identifier
13-11 codes of syscmd(7:4) of external data identifier
14-1 transfer data rate and data pattern
14-2 data write sequence
14-3 data read sequence
14-4 code of system interface command syscmd(7:5)
14-5 code of syscmd(4:3) during read request
14-6 code of syscmd(2:0) during block read request
14-7 code of syscmd(2:0) during single read request
14-8 code of syscmd(4:3) during write request
14-9 code of syscmd(2:0) during block write request
14-10 code of syscmd(2:0) during single write request
14-11 code of syscmd(4:3) during null request
14-12 codes of syscmd(7:5) of processor data identifier
14-13 codes of syscmd(7:4) of external data identifier
15-1 system interface bus data
15-2 code of system interface command syscmd(7:5)
15-3 code of syscmd(4:3) during read request
15-4 code of syscmd(2:0) during block read request
15-5 code of syscmd(2:0) during single read request
15-6 code of syscmd(4:3) during write request
15-7 code of syscmd(2:0) during block write request
15-8 code of syscmd(2:0) during single write request
15-9 code of syscmd(4:3) during null request
15-10 codes of syscmd(7:5) of processor data identifier
15-11 codes of syscmd(7:4) of external data identifier
15-12 code of request identifier sysid0
15-13 code of sysid(2:1) during instruction read
15-14 code of sysid(2:1) during data read
.............. 313
List of tables (4/4)

Table no. Title
17-1 CPU instruction operation notations
17-2 Load and store common functions
17-3 Access type specifications for loads/stores
18-1 Format field code
18-2 Valid format of FPU instruction
18-3 Load and store common functions
18-4 Logical inversion of term depending on true/false of condition
19-1 Instruction hazard of VR5500
21-1 Test interface signals
21-2 Boundary scan sequence
21-3 IE connector pin functions
A-1 Transfer sequence by sub-block ordering: where start address is 10 (binary)
A-2 Transfer sequence by sub-block ordering: where start address is 11 (binary)
A-3 Transfer sequence by sub-block ordering: where start address is 01 (binary)
Chapter 1 General

This chapter outlines the VR5500 (μPD30550), a 64-/32-bit RISC microprocessor.

1.1 Features

The VR5500 is one of NEC's VR Series microprocessors. It is a high-performance 64-/32-bit microprocessor employing the RISC (reduced instruction set computer) architecture developed by MIPS™. A bus width of 64 bits or 32 bits can be selected for the system interface of the VR5500, which operates with a protocol compatible with the VR5000 Series™ and the VR5432.

The VR5500 has the following features.

• Maximum operating frequency
  Internal: 400 MHz or 300 MHz, external: 133 MHz
• The internal operating frequency is obtained by multiplying the external operating clock (the input clock, which is also the clock for the bus interface). The multiplication rate can be selected from 2, 2.5, 3, 3.5, 4, 4.5, 5, or 5.5 at reset.
• 64-bit architecture for 64-bit data processing
• 2-way superscalar pipeline
  Parallel processing by six execution units (ALU0, ALU1, FPU, FPU/MAC, BRU, and LSU)
• Out-of-order execution mechanism
• Branch prediction mechanism
  A branch history table with 4K entries reduces branch delay.
• Virtual address management by a high-speed translation lookaside buffer (TLB) (48 double entries)
• Address space
  Physical: 36 bits (with 64-bit bus), 32 bits (with 32-bit bus)
  Virtual: 40 bits (in 64-bit mode), 31 bits (in 32-bit mode)
• Internal cache memory
  2-way set associative with line lock function
  Instruction: 32 KB
  Data: 32 KB, non-blocking structure. The write method can be selected from writeback and write through.
• 64-/32-bit address/data multiplexed bus
  The bus width is selected at reset.
  Compatible with the bus protocol of existing products
  64-bit bus: compatible with the bus protocol of the VR5000 Series
  32-bit bus: compatible with the bus protocol of the VR5432 (native mode) or RM523x (Note)
  Out-of-order return mode can be selected for each bus width.

Note: Product of PMC-Sierra
• Internal transaction buffer
• Internal floating-point unit
• Hardware debug function (N-Wire)
• Conforms to the MIPS I, II, III, and IV instruction sets. Also supports sum-of-products instructions, rotate instructions, register scan instructions, and low-power mode instructions.
• Standby mode to reduce power consumption during standby
• Supply voltage
  Core block: VDD = 1.5 V ±5% (300 MHz model), 1.6 to 1.7 V (400 MHz model)
  I/O block: VDDIO = 3.3 V ±5%, 2.5 V ±5%

1.2 Ordering information

Part number           Package                                              Internal maximum operating frequency
μPD30550F2-300-NN1    272-pin plastic BGA (C/D advanced type) (29 × 29)    300 MHz
μPD30550F2-400-NN1    272-pin plastic BGA (C/D advanced type) (29 × 29)    400 MHz

1.3 VR5500 processor

All the internal structures of the VR5500, such as the operation units, register files, and data bus, are 64 bits wide. However, the VR5500 can also execute 32-bit applications even when it operates as a 64-bit microprocessor.

The VR5500 manages instruction execution by using a 2-way superscalar, high-performance pipeline, and realizes out-of-order processing by using six execution units. "Out-of-order" is a method that executes two or more instructions in a queue according to their execution readiness, independent of the program sequence. The hardware detects register dependencies and delays due to loads and branches, and allocates resources so that there is no gap in the pipeline. The execution results are output (i.e., written back) in program order.

Figure 1-1 shows the internal block diagram of the VR5500. The VR5500 consists of 11 main units.
Figure 1-1. Internal block diagram
[Block diagram. Blocks shown: SysAD bus (64/32 bits), SIU (WTB), clock generator (SysClock), CP0 (TLB), IFU (IMQ, BHT, RAS), ICU (RS), RCU (RF, RNRF), EXU (ALU0, ALU1, FPU, FPU/MACU, BRU, LSU), DCU (RB, SB), instruction cache, data cache, test interface, control signals]
1.3.1 Internal block configuration

(1) Instruction cache
The instruction cache uses a 2-way set associative, virtual index, physical tag system and supports line locking. The capacity is 32 KB. Lines are replaced by the LRU (least recently used) method. The line size is 32 bytes (8 words).

(2) Instruction fetch unit (IFU)
This unit fetches instructions from the instruction cache, holds them temporarily in an instruction management queue (IMQ) of 16 entries, and then transfers them to the instruction control unit (ICU). Up to two instructions are fetched and transferred per cycle. The IFU also has a branch prediction mechanism and a branch history table (BHT) of 4096 entries so that instructions can continue to be fetched speculatively. Moreover, one return address stack (RAS) entry is provided so that a return from a subroutine is processed speculatively.

(3) Instruction control unit (ICU)
This unit controls out-of-order execution of instructions. When an instruction is transferred from the IFU, the ICU renames registers to reduce the hazards caused by register dependencies. The instruction is then stored in a reservation station (RS) of 20 entries until it is ready for execution. When instructions are ready, up to two of them are taken out of the RS per cycle and transferred to the execution unit (EXU).

(4) Register control unit (RCU)
This unit has a register file (RF) and a renaming register file (RNRF). The RF consists of sixty-four 64-bit registers, and the RNRF consists of sixteen 64-bit registers. These registers serve as source and destination registers when an instruction is executed. When instruction execution is complete, the RCU transfers the contents of the RNRF to the RF in accordance with the renaming by the ICU, and completes (commits) instruction execution. Up to three instructions can be committed per cycle.

(5) Execution unit (EXU)
This unit consists of the following six sub-units.
• ALU0: 64-bit integer operation unit
• ALU1: 64-bit integer operation unit
• FPU/MAC: 64-/32-bit floating-point operation unit and sum-of-products operation unit (floating-point multiplication and sum-of-products operations; integer multiplication, sum-of-products, and division operations)
• FPU: 64-/32-bit floating-point operation unit
• BRU: branch unit
• LSU: load/store unit
(6) Data cache control unit (DCU)
This unit controls transactions to the data cache and replacement of cache lines. It has a refill buffer (RB) and a store buffer (SB) with four entries each, and can process non-blocking cache operations for up to four accesses. The DCU also supports functions such as uncached load/store, completion of transactions in the order of issuance, and data transfer from the SB to the data cache when instruction execution is committed.

(7) Data cache
The data cache uses a 2-way set associative, virtual index, physical tag system and supports line locking. The capacity is 32 KB. Lines are replaced by the LRU (least recently used) method. The write method can be selected from writeback and write through. The line size is 32 bytes (8 words).

(8) Coprocessor 0 (CP0)
CP0 manages memory, processes exceptions, and monitors performance. For memory management, it provides access protection based on the operating mode (user, supervisor, or kernel), memory segments, and memory pages. Virtual addresses are translated by a translation lookaside buffer (TLB). The TLB is fully associative and has 48 entries. Each entry can be mapped in page sizes of 4 KB to 1 GB. For exception processing, CP0 performs control when an interrupt or exception occurs. For performance monitoring, it counts the number of times an event has occurred so that the efficiency of instruction execution can be checked.

(9) System interface unit (SIU)
The SIU interfaces with an external agent through the SysAD bus. This bus is a 64-/32-bit address/data multiplexed bus and is compatible with the VR5000 Series. To enhance bus efficiency, four 64-bit write transaction buffers (WTBs) are provided. The SIU also supports an uncached accelerated store operation, in which consecutive single write accesses are combined into one block write access.

(10) Clock generator
The clock generator generates the clock for the pipeline from an externally input clock. The frequency ratio can be selected from 1:2, 1:2.5, 1:3, 1:3.5, 1:4, 1:4.5, 1:5, and 1:5.5.

(11) Test interface
This interface connects an external debugging tool. It conforms to the N-Wire specification and controls testing and debugging of the processor by using JTAG interface signals and debug interface signals.
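
As a rough illustration of the clock ratios listed for the clock generator, the following sketch computes the pipeline clock from an external clock frequency. The names MULTIPLIERS and pipeline_clock_mhz are hypothetical helpers introduced only for this example; how a ratio is actually selected on the pins is not modeled here.

```python
from fractions import Fraction

# Ratios listed for the clock generator: 1:2, 1:2.5, ..., 1:5.5 (eight values).
# MULTIPLIERS is a hypothetical name used only in this sketch.
MULTIPLIERS = [Fraction(n, 2) for n in range(4, 12)]


def pipeline_clock_mhz(sysclock_mhz, multiplier):
    """Return the pipeline clock in MHz for an external clock and a ratio."""
    m = Fraction(multiplier)
    if m not in MULTIPLIERS:
        raise ValueError(f"unsupported multiplication rate: {multiplier}")
    return float(sysclock_mhz * m)


if __name__ == "__main__":
    # Example: a 100 MHz external clock with ratio 1:3 gives a 300 MHz pipeline clock.
    print(pipeline_clock_mhz(100, 3))
```

Whether a given combination is usable still depends on the maximum internal and external frequencies stated in 1.1; this sketch does not check those limits.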
1.3.2 CPU registers

The VR5500 has the following registers.
• General-purpose registers (GPR): 64 bits × 32

In addition, the processor provides the following special registers.
• PC: program counter (64 bits)
• HI register: contains the higher doubleword of an integer multiply or divide result (64 bits)
• LO register: contains the lower doubleword of an integer multiply or divide result (64 bits)

Two of the general-purpose registers are assigned the following functions.
• r0: Since it is fixed to zero, it can be used as the target register for any instruction whose result is to be discarded. r0 can also be used as a source when a zero value is needed.
• r31: The link register used by the JAL/JALR instruction. This register can also be used by other instructions. However, make sure that its use by the JAL/JALR instruction does not conflict with its use for other operations.

Register groups are also provided in CP0 (system control coprocessor), for processing exceptions and managing addresses, and in the FPU (floating-point unit), for floating-point operations.

CPU registers can operate as either 32-bit or 64-bit registers, depending on the processor's operation mode. The operation of the CPU registers differs depending on what instructions are executed: 32-bit instructions or MIPS16 instructions.

Figure 1-2 shows the CPU registers.
Figure 1-2. CPU registers
[Register diagram. General-purpose registers (bits 63 to 0): r0 (= 0), r1, r2, ..., r29, r30, r31 (link address). Multiply/divide registers (bits 63 to 0): HI, LO. Program counter (bits 63 to 0): PC]

The VR5500 has no program status word (PSW) register; its role is covered by the Status and Cause registers incorporated in the system control coprocessor (CP0). For details of the CP0 registers, refer to 1.3.4 System control coprocessor (CP0).

1.3.3 Coprocessors

The MIPS ISA defines up to four coprocessors (CP0 to CP3). Of these, CP0 is defined as the system control coprocessor, and CP1 is defined as the floating-point unit. CP2 and CP3 are reserved for future expansion.
1.3.4 System control coprocessor (CP0)

CP0 translates virtual addresses to physical addresses, switches the operating mode (kernel, supervisor, or user mode), and manages exceptions. It also controls the cache subsystem to analyze the cause of an error and to return from the error state.

Table 1-1 shows a list of the CP0 registers. For details of the registers related to the virtual memory system, refer to Chapter 5 Memory management system. For details of the registers related to exception handling, refer to Chapter 6 Exception processing.

Table 1-1. CP0 registers

No.        Register name          Usage                   Description
0          Index                  Memory management       Programmable pointer to TLB array
1          Random                 Memory management       Pseudo-random pointer to TLB array (read only)
2          EntryLo0               Memory management       Lower half of TLB entry for even VPN
3          EntryLo1               Memory management       Lower half of TLB entry for odd VPN
4          Context                Exception processing    Pointer to virtual PTE table in 32-bit mode
5          PageMask               Memory management       Page size specification
6          Wired                  Memory management       Number of wired TLB entries
7          -                      -                       Reserved
8          BadVAddr               Memory management       Virtual address where the most recent error occurred
9          Count                  Exception processing    Timer count
10         EntryHi                Memory management       Higher half of TLB entry (including ASID)
11         Compare                Exception processing    Timer compare value
12         Status                 Exception processing    Operation status setting
13         Cause                  Exception processing    Cause of the most recent exception
14         EPC                    Exception processing    Exception program counter
15         PRId                   Memory management       Processor revision ID
16         Config                 Memory management       Memory system mode setting
17         LLAddr                 Memory management       Address of the LL instruction
18         WatchLo                Exception processing    Memory reference trap address lower bits
19         WatchHi                Exception processing    Memory reference trap address higher bits
20         XContext               Exception processing    Pointer to virtual PTE table in 64-bit mode
21 to 24   -                      -                       Reserved
25         Performance Counter    Exception processing    Performance count and control
26         Parity Error           Exception processing    Cache parity bits
27         Cache Error            Exception processing    Cache error and status register
28         TagLo                  Memory management       Lower half of cache tag
29         TagHi                  Memory management       Higher half of cache tag
30         ErrorEPC               Exception processing    Error exception program counter
31         -                      -                       Reserved
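
The register numbers in Table 1-1 lend themselves to a simple lookup structure, for example in a disassembler or simulator. The sketch below is only an illustration; the class name CP0Reg and the upper-case member names are hypothetical and are not identifiers defined by this manual.

```python
from enum import IntEnum


class CP0Reg(IntEnum):
    """CP0 register numbers from Table 1-1 (member names are illustrative)."""
    INDEX = 0
    RANDOM = 1
    ENTRYLO0 = 2
    ENTRYLO1 = 3
    CONTEXT = 4
    PAGEMASK = 5
    WIRED = 6
    BADVADDR = 8
    COUNT = 9
    ENTRYHI = 10
    COMPARE = 11
    STATUS = 12
    CAUSE = 13
    EPC = 14
    PRID = 15
    CONFIG = 16
    LLADDR = 17
    WATCHLO = 18
    WATCHHI = 19
    XCONTEXT = 20
    PERFORMANCE_COUNTER = 25
    PARITY_ERROR = 26
    CACHE_ERROR = 27
    TAGLO = 28
    TAGHI = 29
    ERROREPC = 30


# Register numbers in the range 0 to 31 that the table marks as reserved.
RESERVED = sorted(set(range(32)) - {int(r) for r in CP0Reg})

if __name__ == "__main__":
    print(CP0Reg(12).name, RESERVED)
```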
1.3.5 Floating-point unit

The floating-point unit (FPU) executes floating-point operations. The FPU of the VR5500 conforms to ANSI/IEEE Standard 754-1985, "IEEE Standard for Binary Floating-Point Arithmetic". The FPU can operate on both single-precision (32-bit) and double-precision (64-bit) values.

The FPU has the following registers.
• Floating-point general-purpose registers (FGR): 64/32 bits × 32
• Floating-point control registers (FCR): 32 bits × 32

The number of bits of the FGRs can be changed by the setting of the FR bit of the Status register in CP0. If the number of bits is set to 32, sixteen 64-bit FGRs can be used for floating-point operations. If it is set to 64 bits, thirty-two 64-bit registers can be used. Of the 32 FCRs, only five can be used.

Figure 1-3 shows the FPU registers.

Figure 1-3. FPU registers
[Register diagram. Floating-point general-purpose registers (bits 63 to 0): FGR0, FGR1, FGR2, ..., FGR29, FGR30, FGR31. Floating-point control registers (bits 31 to 0): FCR0 (implementation/revision), FCR25 (condition code), FCR26 (cause/flag), FCR28 (enable/mode), FCR31 (control/status); all others reserved]

Like the CPU, the FPU uses an instruction set with a load/store architecture. A floating-point operation can be started in each cycle. The load instructions of the FPU include R-type instructions. For details of the FPU, refer to Chapter 7 Floating-point unit and Chapter 8 Floating-point exceptions.

1.3.6 Cache memory

The VR5500 has an internal instruction cache and data cache to enhance the efficiency of the pipeline. The instruction cache and data cache can be accessed in parallel. Both caches have a data width of 64 bits and a capacity of 32 KB, and are managed by a 2-way set associative method. For details of the caches, refer to Chapter 11 Cache memory.
1.4 Outline of instruction set

All the instructions are 32 bits long. The instructions come in three types, as shown in Figure 1-4: immediate (I-type), jump (J-type), and register (R-type).

Figure 1-4. Instruction type

I-type (immediate):  op (31:26) | rs (25:21) | rt (20:16) | immediate (15:0)
J-type (jump):       op (31:26) | target (25:0)
R-type (register):   op (31:26) | rs (25:21) | rt (20:16) | rd (15:11) | sa (10:6) | funct (5:0)

The instructions are classified into the following six groups.

(1) Load/store instructions
These instructions transfer data between memory and a general-purpose register. Most of them are I-type. The addressing mode adds a 16-bit signed offset to the base register. Some load/store instructions are index-type instructions that use floating-point registers (R-type).

(2) Arithmetic operation instructions
These instructions execute an arithmetic operation, logical operation, shift manipulation, or multiplication/division on register values. They are R-type (both the operands and the result of the operation are held in registers) or I-type (one of the operands is a 16-bit signed immediate value).

(3) Jump/branch instructions
These instructions change the flow of program control. A jump instruction jumps to an address that is generated by combining a 26-bit target address with the higher bits of the program counter (J-type), or to an address indicated by a register (R-type). A branch instruction branches to an address given by a 16-bit offset relative to the program counter (I-type). The jump and link instruction saves the return address to register 31.

(4) Coprocessor instructions
These instructions execute the operations of a coprocessor. The load and store instructions of a coprocessor are I-type instructions. The format of the operation instructions of a coprocessor differs depending on the coprocessor (refer to Chapter 7 Floating-point unit).

(5) System control coprocessor instructions
These instructions execute operations on the CP0 registers to manage the memory of the processor and to process exceptions.

(6) Special instructions
These instructions execute system call exceptions and breakpoint exceptions. In addition, they branch to a general-purpose exception processing vector depending on the result of a comparison. The instruction types are R-type and I-type.

For each instruction, refer to Chapter 3 Outline of instruction set, Chapter 17 CPU instruction set, and Chapter 18 FPU instruction set.
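
The field boundaries of the three instruction types in Figure 1-4 can be illustrated with a small decoder. This is a minimal sketch, not part of the manual: decode_fields and sign_extend16 are hypothetical helper names, and the two sample words are ordinary MIPS encodings chosen only as test data.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the fields shown in Figure 1-4.

    All fields are extracted regardless of type; which ones are meaningful
    depends on whether the instruction is I-type, J-type, or R-type.
    """
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,      # bits 31:26, all types
        "rs": (word >> 21) & 0x1F,      # bits 25:21, I-type and R-type
        "rt": (word >> 16) & 0x1F,      # bits 20:16, I-type and R-type
        "rd": (word >> 11) & 0x1F,      # bits 15:11, R-type
        "sa": (word >> 6) & 0x1F,       # bits 10:6,  R-type
        "funct": word & 0x3F,           # bits 5:0,   R-type
        "immediate": word & 0xFFFF,     # bits 15:0,  I-type
        "target": word & 0x03FFFFFF,    # bits 25:0,  J-type
    }


def sign_extend16(value):
    """Interpret a 16-bit field as a signed offset or immediate."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


if __name__ == "__main__":
    i_type = decode_fields(0x8FA40010)  # I-type sample: base r29, rt r4, offset 16
    r_type = decode_fields(0x00221821)  # R-type sample: rs r1, rt r2, rd r3
    print(i_type["op"], i_type["rs"], i_type["rt"], sign_extend16(i_type["immediate"]))
    print(r_type["rs"], r_type["rt"], r_type["rd"], r_type["funct"])
```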
1.5 Data format and addressing

The VR5500 has the following four data formats.
• Doubleword (64 bits)
• Word (32 bits)
• Halfword (16 bits)
• Byte (8 bits)

If the data format is doubleword, word, or halfword, the byte order can be set to big endian or little endian by using the BigEndian pin at reset. The endianness is defined by the position of byte 0 in a multi-byte data structure.

In a big-endian system, byte 0 is the most significant byte (leftmost byte). This byte order is compatible with that employed by the MC68000™ and IBM370™. Figure 1-5 shows the configuration.

Figure 1-5. Byte address of big endian

(a) Word data

Word address   Bits 31-24   Bits 23-16   Bits 15-8   Bits 7-0
12             12           13           14          15         (higher address)
8              8            9            10          11
4              4            5            6           7
0              0            1            2           3          (lower address)

(b) Doubleword data

Doubleword address   Bits 63-56   55-48   47-40   39-32   31-24   23-16   15-8   7-0
16                   16           17      18      19      20      21      22     23    (higher address)
8                    8            9       10      11      12      13      14     15
0                    0            1       2       3       4       5       6      7     (lower address)

Bits 31 to 0 form a word, bits 15 to 0 a halfword, and bits 7 to 0 a byte.

Remarks 1. The most significant byte is at the least significant address.
        2. A word is specified by the address of the most significant byte.
In a little-endian system, byte 0 is the least significant byte (rightmost byte). This byte order is compatible with that employed by the Pentium™ and DEC VAX™. Figure 1-6 shows the configuration. Unless otherwise specified, little endian is used in this manual.

Figure 1-6. Byte address of little endian

(a) Word data

Word address   Bits 31-24   Bits 23-16   Bits 15-8   Bits 7-0
12             15           14           13          12         (higher address)
8              11           10           9           8
4              7            6            5           4
0              3            2            1           0          (lower address)

(b) Doubleword data

Doubleword address   Bits 63-56   55-48   47-40   39-32   31-24   23-16   15-8   7-0
16                   23           22      21      20      19      18      17     16    (higher address)
8                    15           14      13      12      11      10      9      8
0                    7            6       5       4       3       2       1      0     (lower address)

Bits 31 to 0 form a word, bits 15 to 0 a halfword, and bits 7 to 0 a byte.

Remarks 1. The least significant byte is at the least significant address.
        2. A word is specified by the address of the least significant byte.
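
The two byte orders can be compared by laying out the same word in memory both ways. The sketch below is an illustration only; word_bytes is a hypothetical helper name, and the value 0x11223344 is arbitrary test data.

```python
def word_bytes(value, big_endian):
    """Return the four bytes of a 32-bit word in increasing address order."""
    return (value & 0xFFFFFFFF).to_bytes(4, "big" if big_endian else "little")


if __name__ == "__main__":
    value = 0x11223344
    # Big endian: byte 0 (lowest address) holds the most significant byte.
    print(word_bytes(value, big_endian=True).hex())
    # Little endian: byte 0 (lowest address) holds the least significant byte.
    print(word_bytes(value, big_endian=False).hex())
```

In both cases the word is identified by the address of byte 0; only the significance of that byte differs.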
The CPU uses the following addresses to access halfwords, words, and doublewords.
• Halfword: even-byte boundary (0, 2, 4, ...)
• Word: 4-byte boundary (0, 4, 8, ...)
• Doubleword: 8-byte boundary (0, 8, 16, ...)

To load or store data that is not aligned at a 4-byte boundary (word) or 8-byte boundary (doubleword), the following dedicated instructions are used.
• Word: LWL, LWR, SWL, SWR
• Doubleword: LDL, LDR, SDL, SDR

These instructions are always used in L and R pairs. Figure 1-7 illustrates how the word at byte address 3 is accessed.

Figure 1-7. Byte address (unaligned word)

(a) Big endian

                 Bits 31-24   Bits 23-16   Bits 15-8   Bits 7-0
Higher address   4            5            6           -
Lower address    -            -            -           3

(b) Little endian

                 Bits 31-24   Bits 23-16   Bits 15-8   Bits 7-0
Higher address   -            6            5           4
Lower address    3            -            -           -
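
The alignment rules and the unaligned word of Figure 1-7 can be checked with a short model. This sketch only models the value that ends up in the register after an L/R instruction pair; it does not model the individual LWL/LWR instructions. The names is_aligned and read_word and the test memory image are assumptions made for this example.

```python
def is_aligned(address, size):
    """True if address lies on a boundary of size bytes (2, 4, or 8)."""
    return address % size == 0


def read_word(memory, address, big_endian):
    """Assemble the 32-bit word that starts at a byte address, aligned or not."""
    raw = memory[address:address + 4]
    return int.from_bytes(raw, "big" if big_endian else "little")


if __name__ == "__main__":
    # Test image in which the byte at address n holds the value n.
    memory = bytes(range(16))
    print(is_aligned(3, 4))                              # address 3 is not word aligned
    print(hex(read_word(memory, 3, big_endian=True)))    # bytes 3, 4, 5, 6 with 3 most significant
    print(hex(read_word(memory, 3, big_endian=False)))   # bytes 3, 4, 5, 6 with 3 least significant
```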
1.6 Memory management system

The VR5500 can manage a physical address space of up to 64 GB (36 bits). Most systems, however, have only 1 GB or less of physical memory. Therefore, the CPU translates addresses, allocates them to a vast virtual address space, and supplies the programmer with an extended memory space. For details of these address spaces, refer to Chapter 5 Memory management system.

1.6.1 High-speed translation lookaside buffer (TLB)

The TLB translates a virtual address into a physical address. It is fully associative and has 48 entries. Each entry holds the mapping information of two consecutive pages. The page size can be set from 4 KB to 1 GB in steps of a factor of 4.

(1) Joint TLB (JTLB)
This TLB holds both instruction addresses and data addresses. The higher bits of a virtual address (the number of bits depends on the page size) and a process identifier are combined and compared with each entry of the JTLB. If there is no matching entry in the TLB, an exception occurs, and software writes the entry contents from a page table in memory to the TLB. The entry to be written is determined by the value of the Random register or the Index register.

(2) Micro TLB
This TLB is for address translation in a cache. Two micro TLBs are available: an instruction micro TLB and a data micro TLB. Each micro TLB has four entries, and the contents of an entry can be loaded from the JTLB. However, loading to the micro TLB is performed internally and cannot be monitored by software.

1.6.2 Processor modes

(1) Operating mode
The VR5500 has three operating modes: user, supervisor, and kernel. The memory mapping differs depending on the operating mode. For details, refer to Chapter 5 Memory management system.

(2) Addressing mode
The VR5500 has two addressing modes: 32-bit and 64-bit addressing. The address translation method and memory mapping differ depending on the addressing mode.
For details, refer to Chapter 5 Memory management system.

1.7 Instruction pipeline

The VR5500 has an instruction pipeline of up to 10 stages. It also has a mechanism that can execute two instructions simultaneously, so that a floating-point operation instruction and an instruction of another type can be executed at the same time. For details, refer to Chapter 4 Pipeline.

1.7.1 Branch prediction

The VR5500 has an internal branch prediction mechanism that accelerates branching. The branch history is recorded in a branch history table. A fetched branch instruction is executed according to the prediction from this table, and the subsequent instructions are processed speculatively. For operations when branch prediction hits or misses, refer to Chapter 4 Pipeline.
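
As a worked example for the figures given in 1.6 and 1.6.1, the sketch below enumerates the TLB page sizes between 4 KB and 1 GB that differ by a factor of 4, and confirms that a 36-bit physical address corresponds to 64 GB. The names PAGE_SIZES, address_space_gb, and max_jtlb_coverage are hypothetical and introduced only for this illustration.

```python
KB = 1024
GB = 1024 ** 3

# Page sizes from 4 KB to 1 GB, each four times the previous one.
PAGE_SIZES = [4 * KB * 4 ** i for i in range(10)]


def address_space_gb(address_bits):
    """Size in GB of an address space with the given number of address bits."""
    return 2 ** address_bits // GB


def max_jtlb_coverage(page_size, entries=48, pages_per_entry=2):
    """Bytes mapped when every entry maps two consecutive pages of one size."""
    return entries * pages_per_entry * page_size


if __name__ == "__main__":
    print([size // KB for size in PAGE_SIZES])   # page sizes in KB
    print(address_space_gb(36))                  # physical space with a 64-bit bus
    print(max_jtlb_coverage(4 * KB) // KB)       # coverage in KB with 4 KB pages
```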
Chapter 2 Pin functions

2.1 Pin configuration

• 272-pin plastic BGA (C/D advanced type) (29 × 29)

[Package drawing: bottom view and top view. Columns are numbered 1 to 21; rows are lettered A, B, C, D, E, F, G, H, J, K, L, M, N, P, R, T, U, V, W, Y, AA]
(1/2)

Pin no.  Pin name      Pin no.  Pin name      Pin no.  Pin name       Pin no.  Pin name
A1       VSS           B17      SysAD27       D12      VSS            H4       VDD
A2       VSS           B18      VDDIO         D13      SysAD31        H18      VSS
A3       VDDIO         B19      VDDIO         D14      VDD            H19      VSS
A4       VDDIO         B20      VSS           D15      SysAD60        H20      VSS
A5       Reset#        B21      VSS           D16      VSS            H21      SysAD21
A6       PReq#         C1       VDDIO         D17      SysAD26        J1       SysCmd7
A7       ValidIn#      C2       VDDIO         D18      VSS            J2       SysCmd8
A8       ValidOut#     C3       VSS           D19      VSS            J3       TIntSel
A9       VSS           C4       VSS           D20      VDDIO          J4       Int0#
A10      SysADC7       C5       VSS           D21      VDDIO          J18      SysAD52
A11      SysADC3       C6       VDD           E1       SysCmd0        J19      SysAD20
A12      SysADC1       C7       WrRdy#        E2       DisDValidO#    J20      SysAD51
A13      SysADC4       C8       VSS           E3       DWBTrans#      J21      SysAD19
A14      SysAD62       C9       SysID1        E4       O3Return#      K1       Int1#
A15      SysAD30       C10      VDD           E18      SysAD57        K2       VSS
A16      SysAD28       C11      SysADC2       E19      SysAD25        K3       VSS
A17      SysAD59       C12      VSS           E20      SysAD56        K4       VSS
A18      VDDIO         C13      SysAD63       E21      SysAD24        K18      VDD
A19      VDDIO         C14      VDD           F1       SysCmd1        K19      VDD
A20      VSS           C15      SysAD29       F2       VSS            K20      VDD
A21      VSS           C16      VSS           F3       VSS            K21      VDD
B1       VSS           C17      SysAD58       F4       VSS            L1       Int2#
B2       VSS           C18      VDDIO         F18      VDD            L2       Int3#
B3       VDDIO         C19      VSS           F19      VDD            L3       Int4#
B4       VDDIO         C20      VDDIO         F20      VDD            L4       Int5#
B5       ColdReset#    C21      VDDIO         F21      SysAD55        L18      SysAD17
B6       Release#      D1       VDDIO         G1       SysCmd2        L19      SysAD49
B7       ExtRqst#      D2       VDDIO         G2       SysCmd3        L20      SysAD18
B8       BusMode       D3       VSS           G3       SysCmd4        L21      SysAD50
B9       SysID2        D4       VSS           G4       SysCmd5        M1       RMode#/BKTGIO#
B10      VDD           D5       IC            G18      SysAD23        M2       VDD
B11      SysADC6       D6       VDD           G19      SysAD54        M3       VDD
B12      VSS           D7       RdRdy#        G20      SysAD22        M4       VDD
B13      SysADC0       D8       VSS           G21      SysAD53        M18      VSS
B14      VDD           D9       SysID0        H1       SysCmd6        M19      VSS
B15      SysAD61       D10      VDD           H2       VDD            M20      VSS
B16      VSS           D11      SysADC5       H3       VDD            M21      VSS

Caution Leave the IC pin open.
Remark # indicates active low.
(2/2)

Pin no.  Pin name      Pin no.  Pin name      Pin no.  Pin name       Pin no.  Pin name
N1       VDDIO         T21      SysAD12       W2       VDDIO          Y12      VDD
N2       NMI#          U1       NTrcClk       W3       VSS            Y13      SysAD3
N3       VDDIO         U2       NTrcData0     W4       VSS            Y14      VSS
N4       BigEndian     U3       NTrcData1     W5       VDDPA2         Y15      SysAD37
N18      SysAD15       U4       NTrcData3     W6       VSS            Y16      SysAD39
N19      SysAD47       U18      SysAD10       W7       VDDIO          Y17      SysAD40
N20      SysAD16       U19      SysAD42       W8       VDD            Y18      VDDIO
N21      SysAD48       U20      SysAD11       W9       JTDI           Y19      VDDIO
P1       VSS           U21      SysAD43       W10      VSS            Y20      VSS
P2       VSS           V1       NTrcData2     W11      SysAD1         Y21      VSS
P3       VSS           V2       NTrcEnd       W12      VDD            AA1      VSS
P4       VSS           V3       VSS           W13      SysAD35        AA2      VSS
P18      VDD           V4       VSS           W14      VSS            AA3      VDDIO
P19      VDD           V5       VSSPA2        W15      SysAD38        AA4      VDDIO
P20      VDD           V6       VSS           W16      VDD            AA5      VDDPA1
P21      SysAD46       V7       VDDIO         W17      SysAD9         AA6      VDDIO
R1       DivMode0      V8       VDD           W18      VSS            AA7      IC
R2       DivMode1      V9       JTMS          W19      VSS            AA8      JTDO
R3       DivMode2      V10      VSS           W20      VDDIO          AA9      DrvCon
R4       VDDIO         V11      SysAD33       W21      VDDIO          AA10     VSS
R18      SysAD44       V12      VDD           Y1       VSS            AA11     SysAD0
R19      SysAD13       V13      SysAD4        Y2       VSS            AA12     SysAD2
R20      SysAD45       V14      VSS           Y3       VDDIO          AA13     SysAD34
R21      SysAD14       V15      SysAD7        Y4       VDDIO          AA14     SysAD36
T1       VDD           V16      VDD           Y5       VSSPA1         AA15     SysAD5
T2       VDD           V17      SysAD41       Y6       SysClock       AA16     SysAD6
T3       VDD           V18      VSS           Y7       JTRST# (VSS)   AA17     SysAD8
T4       VDD           V19      VSS           Y8       VDD            AA18     VDDIO
T18      VSS           V20      VDDIO         Y9       JTCK           AA19     VDDIO
T19      VSS           V21      VDDIO         Y10      VSS            AA20     VSS
T20      VSS           W1       VDDIO         Y11      SysAD32        AA21     VSS

Caution Leave the IC pin open.
Remarks 1. The name inside the parentheses is the pin name in Ver. 1.x.
        2. # indicates active low.
pin identification

bigendian: big endian
bktgio#: break/trigger input/output
busmode: bus mode
coldreset#: cold reset
disdvalido#: disable delay validout#
divmode(2:0): divide mode
drvcon: driver control
dwbtrans#: doubleword block transfer
extrqst#: external request
ic: internally connected
int(5:0)#: interrupt
jtck: jtag clock
jtdi: jtag data input
jtdo: jtag data output
jtms: jtag mode select
jtrst#: jtag reset
nmi#: non-maskable interrupt
ntrcclk: n-trace clock
ntrcdata(3:0): n-trace data output
ntrcend: n-trace end
o3return#: out-of-order return mode
preq#: processor request
rdrdy#: read ready
release#: release
reset#: reset
sysad(63:0): system address/data bus
sysadc(7:0): system address/data check bus
sysclock: system clock
syscmd(8:0): system command/data identifier bus
sysid(2:0): system bus identifier
tintsel: timer interrupt selection
validin#: valid input
validout#: valid output
vdd: power supply for cpu core
vddio: power supply for i/o
vddpa1, vddpa2: quiet vdd for pll
vss: ground
vsspa1, vsspa2: quiet vss for pll
wrrdy#: write ready

remark: # indicates active low.
2.2 pin functions

remark: # indicates active low.

2.2.1 system interface signals

these signals are used when the vr5500 is connected to an external device in the system. table 2-1 shows the functions of these signals.

table 2-1. system interface signals

pin name | i/o | function
sysad(63:0) | i/o | system address/data bus. this is a 64-bit bus that establishes communication between the processor and external agent. the lower 32 bits (sysad(31:0)) of this bus are used in the 32-bit bus mode.
sysadc(7:0) | i/o | system address/data check bus. this is a parity bus for the sysad bus. it is valid only in the data cycle. the lower 4 bits (sysadc(3:0)) are used in the 32-bit bus mode.
syscmd(8:0) | i/o | system command/data id bus. this is a 9-bit bus that transfers commands and data identifiers between the processor and external agent.
sysid(2:0) | i/o | system bus protocol id. these signals transfer a request identifier in the out-of-order return mode. the processor drives the valid identifier when the validout# signal is asserted. the external agent must drive the valid identifier when the validin# signal is asserted.
validin# | input | valid in. this signal indicates that the external agent is driving a valid address or data onto the sysad bus, a valid command or data identifier onto the syscmd bus, or (in the out-of-order return mode) a valid request identifier onto the sysid bus.
validout# | output | valid out. this signal indicates that the processor is driving a valid address or data onto the sysad bus, a valid command or data identifier onto the syscmd bus, or (in the out-of-order return mode) a valid request identifier onto the sysid bus.
rdrdy# | input | read ready. this signal indicates that the external agent is ready to acknowledge a processor read request.
wrrdy# | input | write ready. this signal indicates that the external agent is ready to acknowledge a processor write request.
extrqst# | input | external request. this signal is used by the external agent to request the right to use the system interface.
release# | output | release interface. this signal indicates that the processor releases the system interface to the slave status.
preq# | output | processor request. this signal indicates that the processor has a pending request.
2.2.2 initialization interface signals

these signals are used by the external device to initialize the operation parameters of the processor. table 2-2 shows the functions of these signals.

table 2-2. initialization interface signals (1/2)

pin name | i/o | function
divmode(2:0) | input | division mode. these signals set the division ratio of pclock and sysclock. 111: divided by 5.5, 110: divided by 5, 101: divided by 4.5, 100: divided by 4, 011: divided by 3.5, 010: divided by 3, 001: divided by 2.5, 000: divided by 2. set the level of these signals before starting a power-on reset, and make sure that the level does not change during operation.
bigendian | input | endian mode. this signal sets the byte order for addressing. 1: big endian, 0: little endian. set the level of this signal before starting a power-on reset, and make sure that the level does not change during operation.
busmode | input | bus mode. this signal sets the bus width of the system interface. 1: 64 bits, 0: 32 bits. set the level of this signal before starting a power-on reset, and make sure that the level does not change during operation.
tintsel | input | interrupt source select. this signal sets the interrupt source to be allocated to the ip7 bit of the cause register. 1: timer interrupt, 0: int5# input and external write request (sysad5). set the level of this signal before starting a power-on reset, and make sure that the level does not change during operation.
disdvalido# | input | validout# delay enable. 1: validout# is active even while the address cycle is stalled. 0: validout# is active only in the address issuance cycle. set the level of this signal before starting a power-on reset, and make sure that the level does not change during operation.

remark: 1: high level, 0: low level
table 2-2. initialization interface signals (2/2)

pin name | i/o | function
dwbtrans# | input | doubleword block transfer enable (valid only in 32-bit mode). 1: disabled, 0: enabled. set the level of this signal before starting a power-on reset, and make sure that the level does not change during operation.
o3return# | input | out-of-order return mode. this signal sets the protocol of the system interface. 1: normal mode, 0: out-of-order return mode. set the level of this signal before starting a power-on reset, and make sure that the level does not change during operation.
coldreset# | input | cold reset. this signal completely initializes the internal status of the processor. deassert this signal in synchronization with sysclock.
reset# | input | reset. this signal logically initializes the internal status of the processor. deassert this signal in synchronization with sysclock.
drvcon | input | drive control. this signal sets the impedance of the external output driver. 1: weak, 0: normal (recommended). set the level of this signal before starting a power-on reset, and make sure that the level does not change during operation. remark: this signal is used in ver. 2.0 or later. it is fixed to 0 in ver. 1.x.

remark: 1: high level, 0: low level

the o3return#, dwbtrans#, disdvalido#, and busmode signals are used to determine the protocol of the system interface. these signals select the protocol as follows.

protocol | o3return# | dwbtrans# | disdvalido# | busmode
vr5000-compatible | 1 | 1 | 1 | 1
rm523x-compatible | 1 | 1 | 1 | 0
vr5432 native mode-compatible | 1 | 0 | 0 | 0
out-of-order return mode | 0 | any | any | any

remark: 1: high level, 0: low level

rm523x is a product of pmc-sierra.
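the pin settings in table 2-2 and the protocol table above can be summarized in a short sketch. this is illustrative only: the function names are hypothetical and not part of the manual, and pin combinations that the protocol table does not list are simply rejected, because the manual does not say how they behave.

```python
# Hypothetical helpers; names are illustrative and do not come from the manual.

def divmode_ratio(divmode: int) -> float:
    """Return the division ratio selected by a 3-bit divmode(2:0) value."""
    if not 0 <= divmode <= 0b111:
        raise ValueError("divmode is a 3-bit value")
    # Table 2-2: 000 -> 2, 001 -> 2.5, ..., 111 -> 5.5 (steps of 0.5).
    return 2.0 + 0.5 * divmode


def system_interface_protocol(o3return_n: int, dwbtrans_n: int,
                              disdvalido_n: int, busmode: int) -> str:
    """Map the four pin levels (1 = high, 0 = low) to a protocol name."""
    if o3return_n == 0:
        # The other three pins are "any" in this mode.
        return "out-of-order return mode"
    listed = {
        (1, 1, 1): "vr5000-compatible",
        (1, 1, 0): "rm523x-compatible",
        (0, 0, 0): "vr5432 native mode-compatible",
    }
    key = (dwbtrans_n, disdvalido_n, busmode)
    if key not in listed:
        # Assumption: unlisted combinations are treated as an error here.
        raise ValueError("combination not listed in the protocol table")
    return listed[key]
```

for example, `divmode_ratio(0b011)` gives 3.5, matching the "011: divided by 3.5" entry.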
2.2.3 interrupt interface signals

the external device uses these signals to send an interrupt request to the vr5500. table 2-3 shows the functions of these signals.

table 2-3. interrupt interface signals

pin name | i/o | function
int(5:0)# | input | interrupt. these are general-purpose processor interrupt requests. the input status of these signals can be checked by the cause register. whether int5# is acknowledged is determined by the status of the tintsel signal at reset.
nmi# | input | non-maskable interrupt. this is an interrupt request that cannot be masked.

2.2.4 clock interface signals

these signals are used to supply or manage the clock. table 2-4 shows the functions of these signals.

table 2-4. clock interface signals

pin name | i/o | function
sysclock | input | system clock. clock signal input to the processor.
vddpa1, vddpa2 | - | vdd for pll. power supply for the internal pll.
vsspa1, vsspa2 | - | vss for pll. ground for the internal pll.

2.2.5 power supply

table 2-5. power supply

pin name | i/o | function
vdd | - | power supply pin for core
vddio | - | power supply pin for i/o
vss | - | ground pin

caution: the vr5500 uses two power supplies. power can be applied to these power supplies in any order. however, do not allow a voltage to be applied to only one of the power supplies for 100 ms or more.
2.2.6 test interface signals

these signals are used to test the vr5500. they include the jtag interface signals conforming to ieee standard 1149.1 and debug interface signals conforming to the n-wire specifications. table 2-6 shows the functions of these signals.

table 2-6. test interface signals

pin name | i/o | function
ntrcdata(3:0) | output | trace data. trace data output.
ntrcend | output | trace end. this signal delimits (indicates the end of) a trace data packet.
ntrcclk | output | trace clock. this clock is for the test interface. the same clock as sysclock is output.
rmode#/bktgio# | i/o | reset mode/break trigger i/o. this pin inputs a debug reset mode signal while the jtrst# signal (coldreset# signal in ver. 1.x) is active. it inputs/outputs a break or trigger signal during normal operation.
jtdi | input | jtag data input. serial data input for jtag.
jtdo | output | jtag data output. serial data output for jtag. this signal is output at the falling edge of jtck.
jtms | input | jtag mode select. this signal selects the jtag test mode.
jtck | input | jtag clock input. this is a serial clock input signal for jtag. the maximum frequency is 33 mhz. it is not necessary to synchronize this signal with sysclock.
jtrst# | input | jtag reset input. this signal is used to initialize the jtag test module. remark: only ver. 2.0 or later
2.3 handling of unused pins

2.3.1 system interface pins

(1) 32-bit bus mode

in the vr5500, the width of the sysad bus can be set to 64 bits or 32 bits. when the 32-bit bus mode is selected, only the necessary system interface pins are used. in the 32-bit bus mode, therefore, handle the unused pins as follows.

pin | handling
sysad(63:32) | leave open
sysadc(7:4) | leave open

(2) normal mode

in the out-of-order return mode, the vr5500 can process read/write transactions regardless of the sequence in which the requests were issued. the sysid(2:0) pins are then used to identify the request. these signals are not used in the normal mode and therefore must be handled as follows.

pin | handling
sysid(2:0) | leave open

(3) parity bus

the vr5500 allows selection of whether or not to protect data by using parity. when parity is used, parity data is output from the processor or external agent to the sysadc bus. because the use of parity is selected by software, however, the vr5500 cannot determine the operation of the sysadc bus until the program is started. therefore, make sure that the sysadc bus is neither left open nor allowed to go into a high-impedance state. when it is known that parity will not be used in the system, it is recommended to connect each pin of the sysadc bus to vddio via a high resistance.
2.3.2 test interface pins

the vr5500 can be tested and debugged with the device mounted on the board. the test interface pins are used to connect an external debugging tool. therefore, handle the test interface pins as follows when the debugging function is not used and in the normal operating mode.

pin | handling
jtck | pull up
jtdi | pull up
jtms | pull up
jtrst# (note) | pull down
jtdo | leave open
ntrcclk | leave open
ntrcdata(3:0) | leave open
ntrcend | leave open
rmode#/bktgio# | pull up

note: only ver. 2.0 or later
chapter 3 outline of instruction set

this chapter describes the architecture of the instruction set and outlines the cpu instruction set used for the vr5500.

3.1 instruction set architecture

the vr5500 can execute the mips iv instruction set and additional instructions dedicated to the vr5500. at present, five mips instruction set levels, levels i to v, are available. instruction sets with higher level numbers include instruction sets with lower level numbers (refer to figure 3-1). therefore, a processor having the mips v instruction set can execute the binary program of mips i, mips ii, mips iii, and mips iv without modification.

figure 3-1. expansion of mips architecture (nested levels, innermost to outermost: mips i, mips ii, mips iii, mips iv, mips v)

the instructions used in the vr5500 can be classified as follows. for operation details, refer to the corresponding chapter.

- cpu instructions (refer to 3.3 outline of cpu instruction set and chapter 17 cpu instruction set)
- floating-point (fpu) instructions (refer to 7.5 outline of fpu instruction set and chapter 18 fpu instruction set)
3.1.1 instruction format

all instructions are 1-word (32-bit) instructions and are located at the word boundary. three types of instruction formats are available as shown in figure 3-2. limiting the instruction formats to three simplifies instruction decoding. operations and addressing modes that are complicated and not often used are realized by combining two or more instructions with a compiler.

figure 3-2. instruction format

i type (immediate): op (bits 31 to 26) | rs (25 to 21) | rt (20 to 16) | immediate (15 to 0)
j type (jump): op (bits 31 to 26) | target (25 to 0)
r type (register): op (bits 31 to 26) | rs (25 to 21) | rt (20 to 16) | rd (15 to 11) | sa (10 to 6) | funct (5 to 0)

op: 6-bit operation code
rs: 5-bit source register number
rt: 5-bit target (source/destination) register number or branch condition
immediate: 16-bit immediate value, branch displacement, or address displacement
target: 26-bit unconditional branch target address
rd: 5-bit destination register number
sa: 5-bit shift amount
funct: 6-bit function field
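the field layout of figure 3-2 can be illustrated with a small decoder. this is a minimal sketch: the function name and the dictionary layout are assumptions made for illustration, and the function extracts every field of all three formats without deciding which format a given opcode actually uses.

```python
def decode_fields(word: int) -> dict:
    """Split a 32-bit instruction word into the fields named in figure 3-2."""
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,      # bits 31 to 26, all formats
        "rs": (word >> 21) & 0x1F,      # bits 25 to 21, i and r types
        "rt": (word >> 16) & 0x1F,      # bits 20 to 16, i and r types
        "rd": (word >> 11) & 0x1F,      # bits 15 to 11, r type
        "sa": (word >> 6) & 0x1F,       # bits 10 to 6, r type
        "funct": word & 0x3F,           # bits 5 to 0, r type
        "immediate": word & 0xFFFF,     # bits 15 to 0, i type
        "target": word & 0x03FFFFFF,    # bits 25 to 0, j type
    }


# Example: an i-type word built from op = 0x23, rs = 29, rt = 8, immediate = 4.
example = (0x23 << 26) | (29 << 21) | (8 << 16) | 4
fields = decode_fields(example)
```

which fields are meaningful depends on the format selected by op; the caller is expected to pick the relevant subset.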
3.1.2 load/store instructions

the load/store instructions transfer data between memory and the general-purpose registers of the cpu and the coprocessors. these instructions are used to transfer fields of various sizes, treat loaded data as a signed or unsigned integer, access unaligned fields, select the addressing mode, and update memory atomically (read-modify-write).

the address of a halfword, word, or doubleword is the lowest byte address among the bytes forming the object, regardless of the byte order (big endian or little endian). in big endian, this is the address of the most significant byte; in little endian, it is the address of the least significant byte.

with some exceptions, the load/store instructions must access an object that is naturally aligned. if an attempt is made to load/store an object at an address that is not an integer multiple of the size of the object, an address error exception occurs.

new load/store operations have been added at each level of the architecture.

mips ii
- 64-bit coprocessor transfer
- atomic update

mips iii
- 64-bit cpu transfer
- loading unsigned word to cpu

mips iv
- register + register addressing mode of fpu

remarks 1. the vr5500 does not support an environment where two or more processors operate simultaneously. to maintain compatibility with the other vr series processors, however, the atomic memory update instructions defined by the mips ii isa (such as the load link instruction and conditional store instruction) operate correctly. the load link bit (ll bit) is set by the ll instruction, cleared by the eret instruction, and tested by the sc instruction. if the ll bit cannot be set because the cache has become invalid, it can be manipulated only when it is reset from an external source.
2. the sync instruction is processed as a nop instruction. the processor waits until all the instructions issued before the sync instruction are committed. therefore, ll/sc instructions placed before and after the sync instruction are executed in program order.

tables 3-1 and 3-2 show the supported load/store instructions and the level of the mips architecture at which each instruction is first supported.
table 3-1. load/store instructions using register + offset addressing mode

data size | cpu signed load | cpu unsigned load | cpu store | coprocessor (except 0) load | coprocessor (except 0) store
byte | i | i | i | - | -
halfword | i | i | i | - | -
word | i | iii | i | i | i
doubleword | iii | - | iii | ii | ii
unaligned word | i | - | i | - | -
unaligned doubleword | iii | - | iii | - | -
link word (atomic modify) | ii | - | ii | - | -
link doubleword (atomic modify) | iii | - | iii | - | -

table 3-2. load/store instructions using register + register addressing mode

data size | floating-point coprocessor load | floating-point coprocessor store
word | iv | iv
doubleword | iv | iv

(1) scheduling load delay slot

the instruction position immediately after a load instruction is called a load delay slot. an instruction that uses the load destination register can be placed in the load delay slot, but an interlock is then generated for the required number of cycles. any instruction may therefore be placed there, but it is recommended to schedule the load delay slot to improve performance and to maintain compatibility with the vr series. because the vr5500 executes instructions by using an out-of-order mechanism, however, it can resolve a load delay even if the slot is not scheduled by software.

(2) definition of access type

the access type is the size of the data the processor loads/stores. the opcode of a load/store instruction determines the access type. figure 3-3 shows the access type and the data that is loaded/stored.

the address used for a load/store instruction is the lowest byte address of the accessed data (the address of the most significant byte in big endian, and of the least significant byte in little endian), regardless of the access type and byte order (endianness). the byte order in the doubleword of the accessed data is determined by the access type and the lower 3 bits of the address, as shown in figure 3-3.

combinations of the access type and the lower bits of the address other than those shown in figure 3-3 are prohibited (except for the luxc1 and suxc1 instructions). if such combinations are used, an address error exception occurs.
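the natural-alignment rule described in 3.1.2 can be sketched as follows. the function name is hypothetical, and the sketch covers only the naturally aligned byte, halfword, word, and doubleword accesses; it does not model the unaligned instructions or the luxc1/suxc1 exception mentioned above.

```python
def is_naturally_aligned(address: int, size: int) -> bool:
    """Return True if address is an integer multiple of the object size in bytes."""
    if size not in (1, 2, 4, 8):
        raise ValueError("size must be 1, 2, 4, or 8 bytes")
    # size is a power of two, so (address & (size - 1)) == 0 is equivalent.
    return address % size == 0
```

under this rule a word access at 0x1004 is aligned, while a doubleword access at the same address is not and would be expected to raise an address error exception.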
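figure 3-3. byte specification related to load and store instructions

accessed byte numbers are listed from the bit 63 side to the bit 0 side of the doubleword.

access type (value) | lower address bits (2 1 0) | accessed bytes (big endian) | accessed bytes (little endian)
doubleword (7) | 000 | 0 1 2 3 4 5 6 7 | 7 6 5 4 3 2 1 0
7-byte (6) | 000 | 0 1 2 3 4 5 6 | 6 5 4 3 2 1 0
7-byte (6) | 001 | 1 2 3 4 5 6 7 | 7 6 5 4 3 2 1
6-byte (5) | 000 | 0 1 2 3 4 5 | 5 4 3 2 1 0
6-byte (5) | 010 | 2 3 4 5 6 7 | 7 6 5 4 3 2
5-byte (4) | 000 | 0 1 2 3 4 | 4 3 2 1 0
5-byte (4) | 011 | 3 4 5 6 7 | 7 6 5 4 3
word (3) | 000 | 0 1 2 3 | 3 2 1 0
word (3) | 100 | 4 5 6 7 | 7 6 5 4
3-byte (2) | 000 | 0 1 2 | 2 1 0
3-byte (2) | 001 | 1 2 3 | 3 2 1
3-byte (2) | 100 | 4 5 6 | 6 5 4
3-byte (2) | 101 | 5 6 7 | 7 6 5
halfword (1) | 000 | 0 1 | 1 0
halfword (1) | 010 | 2 3 | 3 2
halfword (1) | 100 | 4 5 | 5 4
halfword (1) | 110 | 6 7 | 7 6
byte (0) | 000 | 0 | 0
byte (0) | 001 | 1 | 1
byte (0) | 010 | 2 | 2
byte (0) | 011 | 3 | 3
byte (0) | 100 | 4 | 4
byte (0) | 101 | 5 | 5
byte (0) | 110 | 6 | 6
byte (0) | 111 | 7 | 7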
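the relationship in figure 3-3 between the access type value, the lower 3 address bits, and the byte positions can be sketched as below. the function names are hypothetical. the sketch assumes that big endian places byte 0 at bits 63 to 56 and little endian places byte 0 at bits 7 to 0, which is how the figure lists the byte numbers; it does not check whether a combination is one of those the figure permits.

```python
def accessed_bytes(access_type: int, low_bits: int) -> list:
    """Byte numbers touched: (access_type + 1) bytes starting at low_bits."""
    size = access_type + 1
    if not (0 <= access_type <= 7 and 0 <= low_bits <= 7):
        raise ValueError("access type and lower address bits are 3-bit values")
    if low_bits + size > 8:
        raise ValueError("access would cross the doubleword boundary")
    return list(range(low_bits, low_bits + size))


def bit_range(byte_number: int, big_endian: bool) -> tuple:
    """Return (msb, lsb) bit positions of a byte within the 64-bit doubleword."""
    lane = 7 - byte_number if big_endian else byte_number
    return (8 * lane + 7, 8 * lane)
```

for a word access (value 3) with lower address bits 100, `accessed_bytes(3, 0b100)` yields bytes 4 to 7, which sit in the low half of the doubleword in big endian and in the high half in little endian.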
3.1.3 operation instructions

2's complement arithmetic is performed on integers expressed in 2's complement form. signed addition, subtraction, multiplication, and division instructions are available. "unsigned" addition and subtraction instructions are also available, but these are actually modulo operation instructions that do not detect overflow. unsigned multiplication and division instructions are also available, as are all shift and logical operation instructions.

mips i executes 32-bit arithmetic operations using 32-bit integers. mips iii adds 64-bit integers and can also execute arithmetic and shift instructions using 64-bit operands. the logical operations are not affected by the width of the registers.

the operation instructions perform the following operations, using the values of registers.

- arithmetic operation
- logical operation
- shift
- rotate
- multiplication
- division
- sum-of-products operation
- counting 0/1 in data

these operations are processed by the following six types of operation instructions.

- alu immediate instructions
- 3-operand type instructions
- shift/rotate instructions
- multiplication/division instructions
- sum-of-products instructions
- register scan instructions

internally, the vr5500 performs processing in 64-bit units. a 32-bit operand can also be used but must be sign-extended. the basic arithmetic and logical instructions such as add, addu, sub, subu, addi, sll, sra, and sllv can support 32-bit operands. if the operand is not correctly sign-extended, however, the operation is undefined. 32-bit data is sign-extended and stored in a 64-bit register.

3.1.4 jump/branch instructions

all jump and branch instructions always have a delay slot of one instruction. the instruction immediately after a jump/branch instruction (the instruction in the delay slot) is executed while the instruction at the destination is being fetched from the cache. a jump/branch instruction must not be placed in a delay slot. if one is placed there, however, no error is detected, and the execution result of the program is undefined.

if execution of the instruction in a delay slot is aborted by the occurrence of an exception or interrupt, the virtual address of the immediately preceding jump/branch instruction is stored in the epc register. when the program returns from processing the exception or interrupt, both the jump/branch instruction and the instruction in its delay slot are re-executed. therefore, do not use register 31 (link address register) as the source register of the jump and link, and branch and link instructions.

because an instruction must be placed at the word boundary, use a register that holds an address whose lower 2 bits are 0 as the operand of the jr and jalr instructions. if the lower 2 bits of the address are not 0, an address error exception occurs when the destination of the instruction is fetched.
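as a rough illustration of the address rules in 3.1.4, the sketch below checks the word-boundary requirement for jr/jalr and computes the targets that this section outlines for the j/jal and branch instructions. the function names are hypothetical, and `jump_target` assumes a 32-bit program counter so that "the higher 4 bits" are bits 31 to 28.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a 2's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def jr_target_is_valid(address: int) -> bool:
    """jr/jalr: the lower 2 bits of the register value must be 0."""
    return address & 0x3 == 0


def jump_target(pc: int, target26: int) -> int:
    """j/jal: (26-bit target << 2) combined with the higher 4 bits of a 32-bit pc."""
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


def branch_target(branch_pc: int, offset16: int) -> int:
    """Branch: delay-slot address plus the sign-extended 16-bit offset << 2."""
    delay_slot = branch_pc + 4          # instruction following the branch
    return (delay_slot + (sign_extend(offset16, 16) << 2)) & MASK64
```

with these definitions, a branch at 0x1000 with offset 0xffff (that is, -1) targets 0x1000 itself, because the offset is applied to the delay-slot address 0x1004.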
(1) outline of jump instructions

to call a subroutine described in a high-level language, the j or jal instruction is usually used. the j and jal instructions are j-type instructions. this format shifts a 26-bit target address 2 bits to the left and combines the result with the higher 4 bits of the current program counter to generate an absolute address.

usually, the jr or jalr instruction is used to exit, dispatch, or jump between pages. both these instructions are r-type and reference the 64-bit byte address of a general-purpose register.

(2) outline of branch instructions

the branch address of all the branch instructions is calculated by adding the 16-bit offset (shifted 2 bits to the left and sign-extended to 64 bits) to the address of the instruction in the delay slot. all the branch instructions generate one delay slot.

if the branch condition of a branch likely instruction is not satisfied, the instruction in the delay slot is nullified. for the other branch instructions, the instruction in the delay slot is executed unconditionally.

3.1.5 special instructions

the special instructions generate an exception by software, unconditionally or conditionally. system call, breakpoint, and trap exceptions occur in the processor. system calls and breakpoints are executed unconditionally, whereas a condition can be specified for a trap. the sync instruction is used to complete all pending instructions. the vr5500 executes the sync instruction as nop.

3.1.6 coprocessor instructions

the coprocessor is an alternate execution unit that has a register file separate from the cpu. the mips architecture allows allocation of up to four coprocessors, 0 to 3. each architecture level defines these coprocessors as shown in table 3-3. coprocessor 0 is always used for system control, and coprocessor 1 is used as a floating-point unit. the other coprocessors are valid in terms of architecture but have no usage allocated. some coprocessors are undefined and their opcode is reserved or used for other purposes.

table 3-3. definition and usage of coprocessors by mips architecture

coprocessor | mips i | mips ii | mips iii | mips iv
0 | system control | system control | system control | system control
1 | floating-point operation | floating-point operation | floating-point operation | floating-point operation
2 | unused | unused | unused | unused
3 | unused | unused | undefined | floating-point operation (cop1x)

a coprocessor has two register sets: coprocessor general-purpose registers and coprocessor control registers. each register set has up to 32 registers. depending on the operation instruction of the coprocessor, both the register sets may be changed.
all system control of a mips processor is provided as coprocessor 0 (cp0: system control processor). this coprocessor has processor control, memory management, and exception processing functions. the cp0 instructions are specific to each cpu.

if the system has an internal floating-point unit, it is used as coprocessor 1 (cp1). with mips iv, the fpu uses the opcode space for coprocessor unit 3 as cop1x. for the fpu instructions, refer to 7.5 outline of fpu instruction set and chapter 18 fpu instruction set.

the coprocessor instructions can be classified into the following two major groups.

- load/store instructions reserved for the main opcode space
- coprocessor-specific operations that are defined by the coprocessor

(1) load/store for coprocessor

no load/store instruction is defined for cp0. to read/write a cp0 register, therefore, only an instruction that transfers data to or from the coprocessor can be used.

(2) coprocessor operation

up to four coprocessors can be used. the coprocessor to which an instruction belongs is indicated by z (z = 0 to 3) suffixed to the mnemonic. in the main opcode, the coprocessor has a coprocessor-specific coded instruction.

3.2 addition and modification of vr5500 instructions

the vr5500 has additional instructions that can be used for multimedia applications, such as sum-of-products instructions and register scan instructions. these additional instructions are not included in the mips iv instruction set. in addition, some instructions already defined in the mips isa have been made available again, and the functions of some instructions have been expanded or changed.

3.2.1 integer rotate instructions

as in the vr5432, integer rotate instructions have been added to the vr5500. these instructions shift the value of a general-purpose register to the right by the number of bits specified by a 5-bit field of the instruction or by a register. the bits shifted out of the least significant end are inserted at the most significant end, and the result is stored in the destination register.

table 3-4. rotate instructions

instruction | definition
dror | doubleword rotate right
dror32 | doubleword rotate right + 32
drorv | doubleword rotate right variable
ror | rotate right
rorv | rotate right variable
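the rotate-right behavior described in 3.2.1 can be sketched as follows. the function names are hypothetical, and the sketch returns plain unsigned values; it makes no claim about how the processor extends a 32-bit result into a 64-bit register. `dror32` is modeled, per its definition in table 3-4, as a doubleword rotate by the 5-bit amount plus 32.

```python
def rotate_right(value: int, amount: int, width: int) -> int:
    """Rotate the low `width` bits of value right by `amount` bits."""
    mask = (1 << width) - 1
    value &= mask
    amount %= width
    # Bits shifted out at the least significant end re-enter at the top.
    return ((value >> amount) | (value << (width - amount))) & mask


def ror(value: int, sa: int) -> int:
    """32-bit rotate right by a 5-bit amount."""
    return rotate_right(value, sa & 0x1F, 32)


def dror(value: int, sa: int) -> int:
    """64-bit rotate right by a 5-bit amount (0 to 31)."""
    return rotate_right(value, sa & 0x1F, 64)


def dror32(value: int, sa: int) -> int:
    """64-bit rotate right by a 5-bit amount plus 32 (32 to 63)."""
    return rotate_right(value, (sa & 0x1F) + 32, 64)
```

for instance, `ror(0x12345678, 8)` moves the low byte 0x78 to the top of the word.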
3.2.2 sum-of-products instructions

as in the vr5432, sum-of-products instructions have been added to the vr5500. these instructions add a value to the result of multiplication, using the hi register and lo register as an accumulator, and store the result in the destination register. the accumulator is 64 bits long, with the lower 32 bits of the hi register as its higher bits and the lower 32 bits of the lo register as its lower bits. no overflow or underflow occurs as a result of executing these instructions. therefore, no exception occurs.

in addition to the macc instruction added to the vr5432, the vr5500 also has a sum-of-products instruction that does not store the result in a general-purpose register, and a multiplication instruction that does not store the result in the hi or lo register.

table 3-5. macc instructions

instruction | definition
macc | multiply, accumulate, and move lo
macchi | multiply, accumulate, and move hi
macchiu | unsigned multiply, accumulate, and move hi
maccu | unsigned multiply, accumulate, and move lo
msac | multiply, negate, accumulate, and move lo
msachi | multiply, negate, accumulate, and move hi
msachiu | unsigned multiply, negate, accumulate, and move hi
msacu | unsigned multiply, negate, accumulate, and move lo
mul | multiply and move lo
mulhi | multiply and move hi
mulhiu | unsigned multiply and move hi
muls | multiply, negate, and move lo
mulshi | multiply, negate, and move hi
mulshiu | unsigned multiply, negate, and move hi
mulsu | unsigned multiply, negate, and move lo
mulu | unsigned multiply and move lo

table 3-6. sum-of-products instructions

instruction | definition
madd | multiply and add word
maddu | multiply and add word unsigned
msub | multiply and subtract word
msubu | multiply and subtract word unsigned
mul64 | multiply
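the accumulator arrangement described in 3.2.2 can be illustrated with a sketch of the macc operation ("multiply, accumulate, and move lo"). this is not taken from the instruction definition in the manual: the function name is hypothetical, the operands are assumed to be signed 32-bit values, and "no overflow" is modeled as wraparound modulo 2^64. the returned hi/lo values are plain 32-bit quantities without any register sign extension.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def sign_extend32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def macc(hi: int, lo: int, rs: int, rt: int) -> tuple:
    """Return (new_hi, new_lo, rd) after accumulating rs * rt into hi:lo."""
    # Accumulator: lower 32 bits of hi are the upper half, lower 32 bits of lo the lower half.
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    product = sign_extend32(rs) * sign_extend32(rt)   # assumed signed multiply
    acc = (acc + product) & MASK64                    # assumed wraparound, no exception
    new_hi, new_lo = acc >> 32, acc & MASK32
    return new_hi, new_lo, new_lo                     # "move lo": rd receives lo
```

a carry out of the lower half propagates into hi, for example `macc(0, 0xFFFFFFFF, 1, 1)` leaves 1 in hi and 0 in lo.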
3.2.3 register scan instructions

register scan instructions have been added to the vr5500. these instructions scan the contents of a general-purpose register and store the number of leading 0s or 1s of the register in the destination register.

table 3-7. register scan instructions

instruction | definition
clo | count leading ones
clz | count leading zeros
dclo | count leading ones in doubleword
dclz | count leading zeros in doubleword

3.2.4 floating-point load/store instructions

these instructions have been added to the vr5500. they load/store data between a floating-point register and memory regardless of whether the data is aligned or not.

table 3-8. floating-point load/store instructions

instruction | definition
luxc1 | load doubleword indexed unaligned
suxc1 | store doubleword indexed unaligned

3.2.5 other additional instructions

coprocessor 0 branch instructions are not supported by the vr5000 series, but they are available again in the vr5500. in addition, instructions that are used to manipulate the contents of the performance counter in coprocessor 0, and a nop instruction that synchronizes the superscalar pipeline are also provided. the standby mode instructions supported by the vr5000 are also provided in the vr5500.

table 3-9. coprocessor 0 instructions

instruction | definition
bc0t | branch on coprocessor 0 true
bc0f | branch on coprocessor 0 false
bc0tl | branch on coprocessor 0 true likely
bc0fl | branch on coprocessor 0 false likely
mtpc | move to performance counter
mfpc | move from performance counter
mtps | move to performance event specifier
mfps | move from performance event specifier
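returning to the register scan instructions of 3.2.3, the counting they perform can be sketched as below. the function name and its parameters are hypothetical; the sketch follows the instruction names in table 3-7 (count leading ones/zeros) and treats the word forms as operating on the low 32 bits.

```python
def count_leading(value: int, width: int = 32, bit: int = 0) -> int:
    """Count consecutive bits equal to `bit`, starting from the most significant bit."""
    value &= (1 << width) - 1
    count = 0
    for position in range(width - 1, -1, -1):
        if (value >> position) & 1 != bit:
            break
        count += 1
    return count


# clz / clo on a 32-bit word, dclz / dclo on a 64-bit doubleword.
clz = lambda v: count_leading(v, 32, 0)
clo = lambda v: count_leading(v, 32, 1)
dclz = lambda v: count_leading(v, 64, 0)
dclo = lambda v: count_leading(v, 64, 1)
```

when every bit matches, the count equals the operand width (32 or 64).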
table 3-10. special instructions

instruction | definition
ssnop | superscalar nop
wait | wait

3.2.6 instructions for which functions and operations were changed

functions and operations have been changed in the following instructions.

table 3-11. instruction function changes in vr5500

instruction | major changed points
cache | in the fill and fetch_and_lock operations, the way to be replaced is selected based on the lru bit of the cache tag.
tlbp | (compatible with mips64)
tlbr | (compatible with vr5000 series)
sc, scd | the ll bit is not changed (note).
sync | the sync instruction is executed after all the on-going instructions complete the commit stage.

note: in the vr5432, the ll bit is cleared when the sc/scd instruction is executed.

3.3 outline of cpu instruction set

3.3.1 load and store instructions

load and store are i-type instructions that transfer data between memory and general-purpose registers. the only addressing mode that load and store instructions directly support is the mode that adds a signed 16-bit immediate offset to the base register.
chapter 3 outline of instruction set preliminary user ? s manual u16044ej1v0um 61 table 3-12. load/store instructions instruction format and description load byte lb rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the bytes specified by the address are sign-extended and loaded to register rt . load byte unsigned lbu rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the bytes specified by the address are zero-extended and loaded to register rt . load halfword lh rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the halfword specified by the address are sign-extended and loaded to register rt . load halfword unsigned lhu rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the halfword specified by the address are zero-extended and loaded to register rt . load word lw rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the word specified by the address is loaded to register rt . in the 64-bit mode, it is sign-extended. load word left lwl rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the word whose address is specified is shifted to the left so that the address-specified byte is at the left-most position of the word. the result is merged with the contents of register rt and loaded to register rt . in the 64-bit mode, it is sign-extended. load word right lwr rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the word whose address is specified is shifted to the right so that the address-specified byte is at the right- most position of the word. 
the result is merged with the contents of register rt and loaded to register rt . in the 64-bit mode, it is sign extended. store byte sb rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the least significant byte of register rt is stored in the memory location specified by the address. store halfword sh rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the least significant halfword of register rt is stored in the memory location specified by the address. store word sw rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the lower word of register rt is stored in the memory location specified by the address. store word left swl rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of register rt is shifted to the right so that the left-most byte of the word is in the position of the address-specified byte. the result is stored in the lower word in memory. store word right swr rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of register rt is shifted to the left so that the right-most byte of the word is in the position of the address-specified byte. the result is stored in the upper word in memory. op b ase rt offset
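the difference between the sign-extending and zero-extending loads in table 3-12 (lb versus lbu) may be easier to see in a small model. the sketch below is illustrative only; the function name, the use of a python bytes object as memory, and the 64-bit register width are assumptions.

```python
def load_byte(memory: bytes, address: int, signed: bool, register_bits: int = 64) -> int:
    """model lb (signed=True) or lbu (signed=False); returns the register bit pattern."""
    value = memory[address]
    if signed and value & 0x80:
        value -= 0x100  # sign-extend: bit 7 is copied into all higher bits
    return value & ((1 << register_bits) - 1)


memory = bytes([0x7F, 0x80])
print(hex(load_byte(memory, 1, signed=True)))   # lb of 0x80
print(hex(load_byte(memory, 1, signed=False)))  # lbu of 0x80
```

the two loads differ only when bit 7 of the byte is set; the same distinction applies to lh/lhu with bit 15 and to lw/lwu with bit 31.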
chapter 3 outline of instruction set preliminary user?s manual u16044ej1v0um 62 table 3-13. load/store instructions (extended isa) instruction format and description load doubleword ld rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the doubleword specified by the address are loaded to register rt . load doubleword left ldl rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the doubleword whose address is specified is shifted to the left so that the address-specified byte is at the left-most position of the doubleword. the result is merged with the contents of register rt and loaded to register rt . load doubleword right ldr rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the doubleword whose address is specified is shifted to the right so that the address-specified byte is at the right-most position of the doubleword. the result is merged with the contents of register rt and loaded to register rt . load linked ll rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the word specified by the address are loaded to register rt and the ll bit is set to 1. load linked doubleword lld rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the doubleword specified by the address are loaded to register rt and the ll bit is set to 1. load word unsigned lwu rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of the word specified by the address are zero-extended and loaded to register rt . store conditional sc rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. 
if the ll bit is set to 1, the contents of the lower word of register rt are stored in the memory specified by the address, and register rt is set to 1. if the ll bit is set to 0, the store operation is not performed and register rt is cleared to 0. store conditional doubleword scd rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. if the ll bit is set to 1, the contents of register rt are stored in the memory specified by the address, and register rt is set to 1. if the ll bit is set to 0, the store operation is not performed and register rt is cleared to 0. store doubleword sd rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of register rt are stored in the memory specified by the address. store doubleword left sdl rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of register rt is shifted to the right so that the left-most byte of the doubleword is in the position of the address-specified byte. the result is stored in the lower doubleword in memory. store doubleword right sdr rt, offset (base) the sign-extended offset is added to the contents of register base to generate an address. the contents of register rt is shifted to the left so that the right-most byte of the doubleword is in the position of the address-specified byte. the result is stored in the higher doubleword in memory. op b ase rt offset
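the ll/sc pair in table 3-13 is typically used in a retry loop. the following is a rough behavioral sketch, not a description of the hardware: the class, its dictionary-based memory, and the retry loop are hypothetical, and the events that clear the ll bit are not modeled because this section does not list them.

```python
class LinkedCpu:
    """toy model of the ll bit and a word-addressed memory (hypothetical)."""

    def __init__(self, memory: dict):
        self.memory = memory
        self.ll_bit = 0

    def ll(self, address: int) -> int:
        """load linked: return the word and set the ll bit to 1."""
        self.ll_bit = 1
        return self.memory.get(address, 0)

    def sc(self, address: int, value: int) -> int:
        """store conditional: returns the value written back to rt (1 or 0).

        the ll bit is left unchanged, as table 3-11 states for the vr5500.
        """
        if self.ll_bit == 1:
            self.memory[address] = value & 0xFFFFFFFF
            return 1
        return 0


def atomic_increment(cpu: LinkedCpu, address: int) -> int:
    """retry the ll/sc sequence until the conditional store succeeds."""
    while True:
        old = cpu.ll(address)
        if cpu.sc(address, old + 1):
            return old + 1
```

software checks the value returned in rt after sc and branches back to the ll when it is 0.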
3.3.2 computational instructions

computational instructions perform arithmetic, logical, and shift operations on values in registers. computational instructions are in either register (r-type) format, in which both operands are registers, or immediate (i-type) format, in which one operand is a 16-bit immediate. computational instructions are classified as follows:

(1) alu immediate instructions
(2) three-operand type instructions
(3) shift/rotate instructions
(4) multiply/divide instructions
(5) sum-of-products instructions
(6) register scan instructions

table 3-14. alu immediate instructions

add immediate: addi rt, rs, immediate
the 16-bit immediate is sign-extended and added to the contents of register rs. the 32-bit result is stored in register rt. in the 64-bit mode, it is sign-extended. an exception occurs on the generation of 2's complement overflow.

add immediate unsigned: addiu rt, rs, immediate
the 16-bit immediate is sign-extended and added to the contents of register rs. the 32-bit result is stored in register rt. in the 64-bit mode, it is sign-extended. no exception occurs on the generation of overflow.

set on less than immediate: slti rt, rs, immediate
the 16-bit immediate is sign-extended and compared to the contents of register rs, treating both operands as signed integers. if rs is less than the immediate, 1 is stored in register rt; otherwise 0 is stored in register rt.

set on less than immediate unsigned: sltiu rt, rs, immediate
the 16-bit immediate is sign-extended and compared to the contents of register rs, treating both operands as unsigned integers. if rs is less than the immediate, 1 is stored in register rt; otherwise 0 is stored in register rt.

and immediate: andi rt, rs, immediate
the 16-bit immediate is zero-extended and anded with the contents of register rs. the result is stored in register rt.
or immediate: ori rt, rs, immediate
the 16-bit immediate is zero-extended and ored with the contents of register rs. the result is stored in register rt.

exclusive or immediate: xori rt, rs, immediate
the 16-bit immediate is zero-extended and ex-ored with the contents of register rs. the result is stored in register rt.

load upper immediate: lui rt, immediate
the 16-bit immediate is shifted left by 16 bits, and the lower 16 bits of the word are set to 0. the result is stored in register rt. in the 64-bit mode, it is sign-extended.

instruction format fields: op, rs, rt, immediate
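a common use of the instructions in table 3-14 is building a 32-bit constant from lui followed by ori. the sketch below models that pairing together with the sign extension applied in the 64-bit mode. the helper names are assumptions made for illustration, and register values are represented as 64-bit bit patterns.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def lui(imm16: int) -> int:
    """immediate shifted left by 16, lower 16 bits zero, result sign-extended."""
    return sign_extend((imm16 & 0xFFFF) << 16, 32) & MASK64


def ori(rs: int, imm16: int) -> int:
    """the immediate is zero-extended before the or."""
    return (rs | (imm16 & 0xFFFF)) & MASK64


def addiu(rs: int, imm16: int) -> int:
    """the immediate is sign-extended; the 32-bit sum is sign-extended to 64 bits."""
    return sign_extend((rs + sign_extend(imm16, 16)) & 0xFFFFFFFF, 32) & MASK64


def load_constant32(value: int) -> int:
    """lui with the upper halfword, then ori with the lower halfword."""
    return ori(lui(value >> 16), value & 0xFFFF)


print(hex(load_constant32(0x12345678)))
print(hex(load_constant32(0x80000001)))  # bit 31 set, so the upper 32 bits become 1s
```

ori is used for the lower halfword rather than addiu because ori zero-extends its immediate, so a lower halfword with bit 15 set does not disturb the upper bits.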
chapter 3 outline of instruction set preliminary user?s manual u16044ej1v0um 64 table 3-15. alu immediate instructions (extended isa) instruction format and description doubleword add immediate daddi rt, rs, immediate the 16-bit immediate is sign-extended to 64 bits and added to the contents of register rs . the 64-bit result is stored in register rt . an exception occurs on the generation of integer overflow. doubleword add immediate unsigned daddiu rt, rs, immediate the 16-bit immediate is sign-extended to 64 bits and added to the contents of register rs . the 64-bit result is stored in register rt . no exception occurs on the generation of overflow. table 3-16. three-operand type instructions instruction format and description add add rd, rs, rt the contents of registers rs and rt are added. the 32-bit result is stored in register rd . in the 64-bit mode, it is sign-extended. an exception occurs on the generation of integer overflow. add unsigned addu rd, rs, rt the contents of registers rs and rt are added. the 32-bit result is stored in register rd . in the 64-bit mode, it is sign-extended. no exception occurs on the generation of integer overflow. subtract sub rd, rs, rt the contents of register rt are subtracted from the contents of register rs . the 32-bit result is stored in register rd . in the 64-bit mode, it is sign-extended. an exception occurs on the generation of integer overflow. subtract unsigned subu rd, rs, rt the contents of register rt are subtracted from the contents of register rs . the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended. no exception occurs on the generation of integer overflow. set on less than slt rd, rs, rt the contents of registers rs and rt are compared, treating both operands as signed integers. if the contents of register rs are less than those of register rt , 1 is stored in register rd ; otherwise 0 is stored in register rd . 
set on less than unsigned sltu rd, rs, rt the contents of registers rs and rt are compared, treating both operands as unsigned integers. if the contents of register rs are less than those of register rt , 1 is stored in register rd ; otherwise 0 is stored in register rd . and and rd, rt, rs the contents of register rs are anded with those of general-purpose register rt bit-wise. the result is stored in register rd . or or rd, rt, rs the contents of register rs are ored with those of general-purpose register rt bit-wise. the result is stored in register rd . exclusive or xor rd, rt, rs the contents of register rs are ex-ored with those of general-purpose register rt bit-wise. the result is stored in register rd . nor nor rd, rt, rs the contents of register rs are nored with those of general-purpose register rt bit-wise. the result is stored in register rd . op rs rt immediate op rs rt funct r d sa
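the only difference between add and addu in table 3-16 is the overflow exception. a minimal sketch of that distinction follows; the exception class and function names are hypothetical, and results are returned as signed python integers to stand for the sign-extended register value.

```python
class IntegerOverflow(Exception):
    """stands in for the integer overflow exception (hypothetical name)."""


def to_signed32(value: int) -> int:
    """interpret the low 32 bits as a two's complement number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def addu(rs: int, rt: int) -> int:
    """32-bit add that wraps silently on overflow."""
    return to_signed32(rs + rt)


def add(rs: int, rt: int) -> int:
    """32-bit add that raises when the signed result does not fit in 32 bits."""
    total = to_signed32(rs) + to_signed32(rt)
    if not -(1 << 31) <= total < (1 << 31):
        raise IntegerOverflow("2's complement overflow")
    return total
```

compilers for c-like languages usually emit the unsigned forms for ordinary integer arithmetic so that overflow never traps.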
table 3-17. three-operand type instructions (extended isa)

doubleword add: dadd rd, rs, rt
the contents of register rs and register rt are added. the 64-bit result is stored in register rd. an exception occurs on the generation of integer overflow.

doubleword add unsigned: daddu rd, rs, rt
the contents of register rs and register rt are added. the 64-bit result is stored in register rd. no exception occurs on the generation of integer overflow.

doubleword subtract: dsub rd, rs, rt
the contents of register rt are subtracted from those of register rs. the 64-bit result is stored in register rd. an exception occurs on the generation of integer overflow.

doubleword subtract unsigned: dsubu rd, rs, rt
the contents of register rt are subtracted from those of register rs. the 64-bit result is stored in register rd. no exception occurs on the generation of integer overflow.

move conditional on not zero: movn rd, rs, rt
the contents of register rs are stored in register rd if register rt is not equal to 0.

move conditional on zero: movz rd, rs, rt
the contents of register rs are stored in register rd if register rt is equal to 0.

table 3-18. shift instructions

shift left logical: sll rd, rt, sa
the contents of register rt are shifted left by sa bits and zeros are inserted into the lower bits. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

shift right logical: srl rd, rt, sa
the contents of register rt are shifted right by sa bits and zeros are inserted into the higher bits. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

shift right arithmetic: sra rd, rt, sa
the contents of register rt are shifted right by sa bits and the higher bits are sign-extended. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

shift left logical variable: sllv rd, rt, rs
the contents of register rt are shifted left and zeros are inserted into the lower bits. the number of bits shifted is specified by the lower 5 bits of register rs. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

shift right logical variable: srlv rd, rt, rs
the contents of register rt are shifted right and zeros are inserted into the higher bits. the number of bits shifted is specified by the lower 5 bits of register rs. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

shift right arithmetic variable: srav rd, rt, rs
the contents of register rt are shifted right and the higher bits are sign-extended. the number of bits shifted is specified by the lower 5 bits of register rs. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

instruction format fields: op (special), rs, rt, rd, sa, funct
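the 32-bit shifts in table 3-18 differ in what fills the vacated bits and, for the variable forms, in where the shift amount comes from. the sketch below models this; the function names mirror the mnemonics but are otherwise assumptions, and results are returned as signed python integers to stand for the sign-extended 32-bit result.

```python
def to_signed32(value: int) -> int:
    """interpret the low 32 bits as a two's complement number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def sll(rt: int, sa: int) -> int:
    """shift left, zeros enter the lower bits."""
    return to_signed32(rt << sa)


def srl(rt: int, sa: int) -> int:
    """shift right, zeros enter the higher bits."""
    return to_signed32((rt & 0xFFFFFFFF) >> sa)


def sra(rt: int, sa: int) -> int:
    """shift right, copies of the sign bit enter the higher bits."""
    return to_signed32(rt) >> sa


def sllv(rt: int, rs: int) -> int:
    """variable shift: only the lower 5 bits of rs are used as the amount."""
    return sll(rt, rs & 0x1F)


print(srl(0x80000000, 31), sra(0x80000000, 31))  # logical versus arithmetic
```

note that srl with a shift amount of 0 still yields a sign-extended result, because the 32-bit result is sign-extended in the 64-bit mode regardless of the operation.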
chapter 3 outline of instruction set preliminary user ? s manual u16044ej1v0um 66 table 3-19. shift instructions (extended isa) instruction format and description doubleword shift left logical dsll rd, rt, sa the contents of register rt are shifted left by sa bits and zeros are inserted into the lower bits. the 64-bit result is stored in register rd . doubleword shift right logical dsrl rd, rt, sa the contents of register rt are shifted right by sa bits and zeros are inserted into the higher bits. the 64-bit result is stored in register rd . doubleword shift right arithmetic dsra rd, rt, sa the contents of register rt are shifted right by sa bits and the higher bits are sign-extended. the 64-bit result is stored in register rd . doubleword shift left logical variable dsllv rd, rt, rs the contents of register rt are shifted left and zeros are inserted into the lower bits. the number of bits shifted is specified by the lower 6 bits of register rs . the 64-bit result is stored in register rd . doubleword shift right logical variable dsrlv rd, rt, rs the contents of register rt are shifted right and zeros are inserted into the higher bits. the number of bits shifted is specified by the lower 6 bits of register rs . the 64-bit result is stored in register rd . doubleword shift right arithmetic variable dsrav rd, rt, rs the contents of register rt are shifted right and the higher bits are sign-extended. the number of bits shifted is specified by the lower 6 bits of register rs . the 64-bit result is stored in register rd . doubleword shift left logical + 32 dsll32 rd, rt, sa the contents of register rt are shifted left by 32 + sa bits and zeros are inserted into the lower bits. the 64-bit result is stored in register rd . doubleword shift right logical + 32 dsrl32 rd, rt, sa the contents of register rt are shifted right by 32 + sa bits and zeros are inserted into the higher bits. the 64-bit result is stored in register rd . 
doubleword shift right arithmetic + 32 dsra32 rd, rt, sa the contents of register rt are shifted right by 32 + sa bits and the higher bits are sign-extended. the 64-bit result is stored in register rd . op rs rt funct r d sa
table 3-20. rotate instructions (for vr5500)

rotate right: ror rd, rt, sa
the contents of register rt are shifted right by sa bits and the lower bits shifted out are inserted into the higher bits. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

rotate right variable: rorv rd, rt, rs
the contents of register rt are shifted right and the lower bits shifted out are inserted into the higher bits. the number of bits shifted is specified by the lower 5 bits of register rs. the 32-bit result is stored in register rd. in the 64-bit mode, it is sign-extended.

doubleword rotate right: dror rd, rt, sa
the contents of register rt are shifted right by sa bits and the lower bits shifted out are inserted into the higher bits. the 64-bit result is stored in register rd.

doubleword rotate right + 32: dror32 rd, rt, sa
the contents of register rt are shifted right by 32 + sa bits and the lower bits shifted out are inserted into the higher bits. the 64-bit result is stored in register rd.

doubleword rotate right variable: drorv rd, rt, rs
the contents of register rt are shifted right and the lower bits shifted out are inserted into the higher bits. the number of bits shifted is specified by the lower 6 bits of register rs. the 64-bit result is stored in register rd.

instruction format fields: special, rs, rt, rd, sa, funct
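a rotate differs from a shift in that the bits shifted out at one end re-enter at the other. the sketch below models ror, dror, and dror32 from table 3-20 on plain bit patterns; the function names follow the mnemonics but the code is an illustration, and the sign extension of the 32-bit result in the 64-bit mode is deliberately left out.

```python
def ror(rt: int, sa: int) -> int:
    """rotate the low 32 bits of rt right by sa (0 to 31)."""
    rt &= 0xFFFFFFFF
    sa &= 31
    return ((rt >> sa) | (rt << (32 - sa))) & 0xFFFFFFFF


def dror(rt: int, amount: int) -> int:
    """rotate a 64-bit value right by amount (0 to 63)."""
    rt &= (1 << 64) - 1
    amount &= 63
    return ((rt >> amount) | (rt << (64 - amount))) & ((1 << 64) - 1)


def dror32(rt: int, sa: int) -> int:
    """rotate right by 32 + sa, where sa is the 5-bit shift field."""
    return dror(rt, 32 + (sa & 31))


print(hex(ror(0x12345678, 8)))
print(hex(dror32(0x00000000FFFFFFFF, 0)))  # swaps the two 32-bit halves
```

the separate dror32 form exists because the sa field is only 5 bits wide, so rotate amounts of 32 to 63 cannot be encoded in dror.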
chapter 3 outline of instruction set preliminary user?s manual u16044ej1v0um 68 table 3-21. multiply/divide instructions instruction format and description multiply mult rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers. the 64-bit result is stored in special registers hi and lo. in the 64-bit mode, it is sign-extended. multiply unsigned multu rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers. the 64-bit result is stored in special registers hi and lo. in the 64-bit mode, it is sign-extended. divide div rs, rt the contents of register rs are divided by those of register rt , treating both operands as 32-bit signed integers. the 32-bit quotient is stored in special register lo, and the 32-bit remainder is stored in special register hi. in the 64-bit mode, it is sign-extended. divide unsigned divu rs, rt the contents of register rs are divided by those of register rt , treating both operands as 32-bit unsigned integers. the 32-bit quotient is stored in special register lo, and the 32-bit remainder is stored in special register hi. in the 64-bit mode, it is sign-extended. move from hi mfhi rd the contents of special register hi are loaded to register rd . move from lo mflo rd the contents of special register lo are loaded to register rd . move to hi mthi rs the contents of register rs are loaded to special register hi. move to lo mtlo rs the contents of register rs are loaded to special register lo. table 3-22. multiply/divide instructions (extended isa) instruction format and description doubleword multiply dmult rs, rt the contents of registers rs and rt are multiplied, treating both operands as signed integers. the 128-bit result is stored in special registers hi and lo. doubleword multiply unsigned dmultu rs, rt the contents of registers rs and rt are multiplied, treating both operands as unsigned integers. 
the 128-bit result is stored in special registers hi and lo. doubleword divide ddiv rs, rt the contents of register rs are divided by those of register rt , treating both operands as signed integers. the 64-bit quotient is stored in special register lo, and the 64-bit remainder is stored in special register hi. doubleword divide unsigned ddivu rs, rt the contents of register rs are divided by those of register rt , treating both operands as unsigned integers. the 64-bit quotient is stored in special register lo, and the 64-bit remainder is stored in special register hi. op rs rt funct r d sa op rs rt funct r d sa
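the way the 32-bit multiply and divide instructions in table 3-21 split their results between hi and lo can be sketched as follows. the function names are assumptions, results are returned as signed python integers standing for the sign-extended register values, and the quotient is assumed to truncate toward zero, which this section does not state. division by zero is not modeled and raises a python error here.

```python
def to_signed32(value: int) -> int:
    """interpret the low 32 bits as a two's complement number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def mult(rs: int, rt: int) -> tuple:
    """signed 32 x 32 multiply; returns (hi, lo) halves of the 64-bit product."""
    product = to_signed32(rs) * to_signed32(rt)
    return to_signed32(product >> 32), to_signed32(product)


def div(rs: int, rt: int) -> tuple:
    """signed divide; returns (hi, lo) = (remainder, quotient)."""
    dividend, divisor = to_signed32(rs), to_signed32(rt)
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient  # assumed: truncation toward zero
    remainder = dividend - quotient * divisor
    return to_signed32(remainder), to_signed32(quotient)


print(mult(0x10000, 0x10000))  # product 2**32: hi = 1, lo = 0
print(div(7, 2))               # remainder in hi, quotient in lo
```

a program then retrieves the halves with mfhi and mflo.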
chapter 3 outline of instruction set preliminary user ? s manual u16044ej1v0um 69 table 3-23. macc instructions (for v r 5500) (1/2) instruction format and description multiply, accumulate, and move lo macc rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, and result is added to a value that combines the lower 32 bits of special registers hi and lo. the lower 32 bits of the result are stored in register rd . unsigned multiply, accumulate, and move lo maccu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, and result is added to a value that combines the lower 32 bits of special registers hi and lo. the lower 32 bits of the result are stored in register rd . multiply, accumulate, and move hi macchi rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, and result is added to a value that combines the lower 32 bits of special registers hi and lo. the higher 32 bits of the result are stored in register rd . unsigned multiply, accumulate, and move hi macchiu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, and result is added to a value that combines the lower 32 bits of special registers hi and lo. the higher 32 bits of the result are stored in register rd . multiply, negate, accumulate, and move lo msac rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, and result is subtracted from a value that combines the lower 32 bits of special registers hi and lo. the lower 32 bits of the result are stored in register rd . 
unsigned multiply, negate, accumulate, and move lo msacu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, and result is subtracted from a value that combines the lower 32 bits of special registers hi and lo. the lower 32 bits of the result are stored in register rd . multiply, negate, accumulate, and move hi msachi rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, and result is subtracted from a value that combines the lower 32 bits of special registers hi and lo. the higher 32 bits of the result are stored in register rd . unsigned multiply, negate, accumulate, and move hi msachiu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, and result is subtracted from a value that combines the lower 32 bits of special registers hi and lo. the higher 32 bits of the result are stored in register rd . multiply and move lo mul rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers. the higher 32 bits of the result is stored in the lower bits of special register hi, and lower 32 bits of the result are stored in lower bits of special register lo and register rd . unsigned multiply and move lo mulu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers. the higher 32 bits of the result is stored in the lower bits of special register hi, and lower 32 bits of the result are stored in lower bits of special register lo and register rd . special rs rt funct r d
chapter 3 outline of instruction set preliminary user ? s manual u16044ej1v0um 70 table 3-23. macc instructions (for v r 5500) (2/2) instruction format and description multiply and move hi mulhi rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers. the higher 32 bits of the result are stored in the lower bits of special register hi and register rd , and the lower 32 bits of the result are stored in the lower bits of special register lo. unsigned multiply and move hi mulhiu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers. the higher 32 bits of the result are stored in the lower bits of special register hi and register rd , and the lower 32 bits of the result are stored in the lower bits of special register lo. multiply, negate, and move lo muls rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, and the result is inverted. the higher 32 bits of the result are stored in the lower bits of special register hi, and the lower 32 bits of the result are stored in the lower bits of special register lo and register rd . unsigned multiply, negate, and move lo mulsu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, the result is inverted. the higher 32 bits of the result are stored in the lower bits of special register hi, and the lower 32 bits of the result are stored in the lower bits of special register lo and register rd . multiply, negate, and move hi mulshi rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, the result is inverted. the higher 32 bits of the result are stored in the lower bits of special register hi and register rd , and the lower 32 bits of the result are stored in the lower bits of special register lo. 
unsigned multiply, negate, and move hi mulshiu rd, rs, rt the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, the result is inverted. the higher 32 bits of the result are stored in the lower bits of special register hi and register rd , and the lower 32 bits of the result are stored in the lower bits of special register lo. special rs rt funct r d
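the accumulate step of the macc family in table 3-23 can be modeled as a 64-bit accumulator formed from the lower 32 bits of hi and lo. the sketch below is a simplified illustration: the function name is hypothetical, and it assumes that the sum is written back to hi and lo, which the table implies by the word accumulate but does not state explicitly.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def to_signed32(value: int) -> int:
    """interpret the low 32 bits as a two's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def macc_step(hi: int, lo: int, rs: int, rt: int, signed: bool = True) -> tuple:
    """add rs * rt to the accumulator {hi[31:0], lo[31:0]}; returns new (hi, lo).

    rd receives the new lo for macc/maccu and the new hi for macchi/macchiu.
    """
    accumulator = ((hi & MASK32) << 32) | (lo & MASK32)
    if signed:
        product = to_signed32(rs) * to_signed32(rt)
    else:
        product = (rs & MASK32) * (rt & MASK32)
    total = (accumulator + product) & MASK64
    return total >> 32, total & MASK32


# dot product of (1, 3, 5) and (2, 4, 6) accumulated in hi/lo
hi, lo = 0, 0
for a, b in [(1, 2), (3, 4), (5, 6)]:
    hi, lo = macc_step(hi, lo, a, b)
print(hi, lo)
```

the msac variants would subtract the product from the accumulator instead of adding it.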
table 3-24. sum-of-products instructions (for vr5500)

instruction format fields: special2, rs, rt, rd, 0, funct

multiply and add word: madd rs, rt
the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, and the result is added to a value that combines the lower 32 bits of special registers hi and lo. the 64-bit result is stored in special registers hi and lo.

multiply and add word unsigned: maddu rs, rt
the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, and the result is added to a value that combines the lower 32 bits of special registers hi and lo. the 64-bit result is stored in special registers hi and lo.

multiply and subtract word: msub rs, rt
the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers, and the result is subtracted from a value that combines the lower 32 bits of special registers hi and lo. the 64-bit result is stored in special registers hi and lo.

multiply and subtract word unsigned: msubu rs, rt
the contents of registers rs and rt are multiplied, treating both operands as 32-bit unsigned integers, and the result is subtracted from a value that combines the lower 32 bits of special registers hi and lo. the 64-bit result is stored in special registers hi and lo.

multiply: mul64 rd, rs, rt
the contents of registers rs and rt are multiplied, treating both operands as 32-bit signed integers. the lower 32 bits of the result are stored in register rd.

since the vr5500 stalls the entire pipeline when executing an integer multiply/divide instruction, the number of execution cycles increases compared with normal instruction execution. the number of processor cycles (pcycles) required for an integer multiply/divide instruction is shown below.

table 3-25. number of cycles for multiply and divide instructions

number of pcycles
instruction | when executed singly | when executed repeatedly
div, divu | 40 | 40
ddiv, ddivu | 72 | 72
macc, macchi, macchiu, maccu, msac, msachi, msachiu, msacu | 3 | 3
mul, mulhi, mulhiu, mulu, muls, mulshi, mulshiu, mulsu | 3 | 3
madd, maddu, msub, msubu | 2 | 2
mul64 | 2 | 2
mult, multu | 3 | 3
dmult, dmultu | 3 | 3
table 3-26. register scan instructions (for vr5500)

count leading ones: clo rd, rs
the 32-bit contents of register rs are scanned from the highest to lowest bit, and the number of consecutive 1s found before the first 0 is stored in register rd.

count leading zeros: clz rd, rs
the 32-bit contents of register rs are scanned from the highest to lowest bit, and the number of consecutive 0s found before the first 1 is stored in register rd.

count leading ones in doubleword: dclo rd, rs
the 64-bit contents of register rs are scanned from the highest to lowest bit, and the number of consecutive 1s found before the first 0 is stored in register rd.

count leading zeros in doubleword: dclz rd, rs
the 64-bit contents of register rs are scanned from the highest to lowest bit, and the number of consecutive 0s found before the first 1 is stored in register rd.

3.3.3 jump and branch instructions

jump and branch instructions change the control flow of a program. all jump and branch instructions occur with a delay of one instruction: that is, the instruction immediately following the jump or branch instruction (known as the instruction in the delay slot) always executes while the target instruction is being fetched from memory. for instructions involving a link (such as jal and bltzal), the return address is saved in register r31.

table 3-27. jump instruction

jump: j target
the 26-bit target address is shifted left by two bits and combined with the higher 4 bits of the pc. the program jumps to this calculated address with a delay of one instruction.

jump and link: jal target
the 26-bit target address is shifted left by two bits and combined with the higher 4 bits of the pc. the program jumps to this calculated address with a delay of one instruction. the address of the instruction following the delay slot is stored in r31 (link register).
jump register: jr rs
the program jumps to the address specified in register rs with a delay of one instruction.

jump and link register: jalr rs, rd
the program jumps to the address specified in register rs with a delay of one instruction. the address of the instruction following the delay slot is stored in rd.

instruction format fields: op, target (j, jal); op, rs, rt, rd, sa, funct (jr, jalr); special2, rs, rt, rd, 0, funct (register scan instructions)
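the target calculation for j and jal in table 3-27 can be written out as a short sketch. it assumes a 32-bit pc, matching the "higher 4 bits" wording of the table, and a 4-byte instruction size; the function names are hypothetical.

```python
def jump_target(pc: int, target26: int) -> int:
    """combine the higher 4 bits of the pc with the 26-bit target shifted left by 2."""
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


def link_address(jump_address: int) -> int:
    """address of the instruction following the delay slot (4-byte instructions)."""
    return jump_address + 8


print(hex(jump_target(0x80001004, 0x0000400)))
print(hex(link_address(0x80001000)))
```

because only 28 bits come from the instruction, j and jal can reach targets only within the same 256 mb region as the pc; jr and jalr are used for anything farther away.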
chapter 3 outline of instruction set preliminary user?s manual u16044ej1v0um 73 table 3-28. branch instructions instruction format and description branch on equal beq rs, rt, offset if the contents of register rs are equal to those of register rt , the program branches to the target address. branch on not equal bne rs, rt, offset if the contents of register rs are not equal to those of register rt , the program branches to the target address. branch on less than or equal to zero blez rs, offset if the contents of register rs are less than or equal to zero, the program branches to the target address. branch on greater than zero bgtz rs, offset if the contents of register rs are greater than zero, the program branches to the target address. instruction format and description branch on less than zero bltz rs, offset if the contents of register rs are less than zero, the program branches to the target address. branch on greater than or equal to zero bgez rs, offset if the contents of register rs are greater than or equal to zero, the program branches to the target address. branch on less than zero and link bltzal rs, offset the address of the instruction that follows delay slot is stored in register r31 (link register). if the contents of register rs are less than zero, the program branches to the target address. branch on greater than or equal to zero and link bgezal rs, offset the address of the instruction that follows delay slot is stored in register r31 (link register). if the contents of register rs are greater than or equal to zero, the program branches to the target address. remark sub: sub-operation code instruction format and description branch on coprocessor 0 true bc0t offset the 16-bit offset (shifted left by two bits and sign-extended) is added to the address of the instruction in the delay slot to calculate the branch target address. 
if the conditional signal of the coprocessor 0 is true, the program branches to the target address with one-instruction delay. branch on coprocessor 0 false bc0f offset the 16-bit offset (shifted left by two bits and sign-extended) is added to the address of the instruction in the delay slot to calculate the branch target address. if the conditional signal of the coprocessor 0 is false, the program branches to the target address with one-instruction delay. remark bc: bc sub-operation code br: branch condition identifier op rs rt offset regimm offset rs sub cop0 offset bc br
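the branch target calculation described for bc0t and bc0f (offset shifted left by two bits, sign-extended, and added to the address of the delay slot instruction) can be sketched as follows. the function names, the 4-byte instruction size, and the 64-bit address width are assumptions for illustration.

```python
def sign_extend(value: int, bits: int) -> int:
    """interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def branch_target(branch_address: int, offset16: int) -> int:
    """delay slot address plus the sign-extended offset multiplied by 4."""
    delay_slot_address = branch_address + 4
    return (delay_slot_address + (sign_extend(offset16, 16) << 2)) & ((1 << 64) - 1)


print(hex(branch_target(0x1000, 0x0003)))  # forward by 3 instructions from the delay slot
print(hex(branch_target(0x1000, 0xFFFF)))  # offset -1: branches to itself
```

an offset of -1 therefore encodes a branch to the branch instruction itself, since the base is the delay slot rather than the branch.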
Table 3-29. Branch Instructions (Extended ISA)

Instruction (format): description

Branch on Equal Likely (BEQL rs, rt, offset): If the contents of register rs are equal to those of register rt, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Not Equal Likely (BNEL rs, rt, offset): If the contents of register rs are not equal to those of register rt, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Less Than or Equal to Zero Likely (BLEZL rs, offset): If the contents of register rs are less than or equal to zero, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Greater Than Zero Likely (BGTZL rs, offset): If the contents of register rs are greater than zero, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Less Than Zero Likely (BLTZL rs, offset): If the contents of register rs are less than zero, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Greater Than or Equal to Zero Likely (BGEZL rs, offset): If the contents of register rs are greater than or equal to zero, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Less Than Zero and Link Likely (BLTZALL rs, offset): The address of the instruction that follows the delay slot is stored in register r31 (link register). If the contents of register rs are less than zero, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Greater Than or Equal to Zero and Link Likely (BGEZALL rs, offset): The address of the instruction that follows the delay slot is stored in register r31 (link register). If the contents of register rs are greater than or equal to zero, the program branches to the target address. If the branch condition is not met, the instruction in the delay slot is discarded.
Remark sub: Sub-operation code
Branch on Coprocessor 0 True Likely (BC0TL offset): The 16-bit offset (shifted left by two bits and sign-extended) is added to the address of the instruction in the delay slot to calculate the branch target address. If the conditional signal of coprocessor 0 is true, the program branches to the target address with a one-instruction delay. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch on Coprocessor 0 False Likely (BC0FL offset): The 16-bit offset (shifted left by two bits and sign-extended) is added to the address of the instruction in the delay slot to calculate the branch target address. If the conditional signal of coprocessor 0 is false, the program branches to the target address with a one-instruction delay. If the branch condition is not met, the instruction in the delay slot is discarded.
Remark BC: BC sub-operation code
       BR: Branch condition identifier
[Instruction format figures, fields only: op | rs | rt | offset; REGIMM | rs | sub | offset; COP0 | BC | BR | offset]
3.3.4 Special instructions

Special instructions mainly generate software exceptions.

Table 3-30. Special Instructions

Instruction (format): description

Synchronize (SYNC): Completes the load/store instruction executing in the current pipeline before the next load/store instruction starts execution.
System Call (SYSCALL): Generates a system call exception, and then transfers control to the exception handling program.
Breakpoint (BREAK): Generates a breakpoint exception, and then transfers control to the exception handling program.

Table 3-31. Special Instructions (Extended ISA) (1/2)

Instruction (format): description

Trap if Greater Than or Equal (TGE rs, rt): The contents of register rs are compared with those of register rt, treating both operands as signed integers. If the contents of register rs are greater than or equal to those of register rt, an exception occurs.
Trap if Greater Than or Equal Unsigned (TGEU rs, rt): The contents of register rs are compared with those of register rt, treating both operands as unsigned integers. If the contents of register rs are greater than or equal to those of register rt, an exception occurs.
Trap if Less Than (TLT rs, rt): The contents of register rs are compared with those of register rt, treating both operands as signed integers. If the contents of register rs are less than those of register rt, an exception occurs.
Trap if Less Than Unsigned (TLTU rs, rt): The contents of register rs are compared with those of register rt, treating both operands as unsigned integers. If the contents of register rs are less than those of register rt, an exception occurs.
Trap if Equal (TEQ rs, rt): If the contents of registers rs and rt are equal, an exception occurs.
Trap if Not Equal (TNE rs, rt): If the contents of registers rs and rt are not equal, an exception occurs.
[Instruction format figure, fields only: SPECIAL | rs | rt | rd | sa | funct]
Table 3-31. Special Instructions (Extended ISA) (2/2)

Instruction (format): description

Trap if Greater Than or Equal Immediate (TGEI rs, immediate): The contents of register rs are compared with 16-bit sign-extended immediate data, treating both operands as signed integers. If the contents of register rs are greater than or equal to the 16-bit sign-extended immediate data, an exception occurs.
Trap if Greater Than or Equal Immediate Unsigned (TGEIU rs, immediate): The contents of register rs are compared with 16-bit sign-extended immediate data, treating both operands as unsigned integers. If the contents of register rs are greater than or equal to the 16-bit sign-extended immediate data, an exception occurs.
Trap if Less Than Immediate (TLTI rs, immediate): The contents of register rs are compared with 16-bit sign-extended immediate data, treating both operands as signed integers. If the contents of register rs are less than the 16-bit sign-extended immediate data, an exception occurs.
Trap if Less Than Immediate Unsigned (TLTIU rs, immediate): The contents of register rs are compared with 16-bit sign-extended immediate data, treating both operands as unsigned integers. If the contents of register rs are less than the 16-bit sign-extended immediate data, an exception occurs.
Trap if Equal Immediate (TEQI rs, immediate): If the contents of register rs and the immediate data are equal, an exception occurs.
Trap if Not Equal Immediate (TNEI rs, immediate): If the contents of register rs and the immediate data are not equal, an exception occurs.
Remark sub: Sub-operation code
Prefetch (PREF hint, offset (base)): Sign-extends a 16-bit offset and adds it to register base to generate a virtual address. The operation to be performed on that address is indicated by the 5-bit hint.

Table 3-32.
Special Instructions (for VR5500)

Instruction (format): description

Superscalar NOP (SSNOP): The processor waits until all preceding instructions have been committed or until writeback to a register by the preceding load instruction has been completed.
[Instruction format figures, fields only: REGIMM | rs | sub | immediate; op | base | hint | offset; SPECIAL | rs | rt | rd | sa | funct]
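The difference between the signed and unsigned trap comparisons in Table 3-31 can be sketched as follows. This is an illustration only: the function names are hypothetical, registers are assumed to be 64 bits wide, and a return value of True stands for "an exception occurs".

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit general-purpose registers


def to_signed64(x):
    """Reinterpret a 64-bit register pattern as a signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def tge(rs, rt):
    """TGE: signed compare, trap if rs >= rt."""
    return to_signed64(rs) >= to_signed64(rt)


def tgeu(rs, rt):
    """TGEU: unsigned compare, trap if rs >= rt."""
    return (rs & MASK64) >= (rt & MASK64)


def tlt(rs, rt):
    """TLT: signed compare, trap if rs < rt."""
    return to_signed64(rs) < to_signed64(rt)


def tltu(rs, rt):
    """TLTU: unsigned compare, trap if rs < rt."""
    return (rs & MASK64) < (rt & MASK64)


def tgei(rs, imm16):
    """TGEI: signed compare against the sign-extended 16-bit immediate."""
    imm = (imm16 & 0xFFFF) - 0x10000 if imm16 & 0x8000 else imm16 & 0xFFFF
    return to_signed64(rs) >= imm
```

A register holding all ones is -1 when treated as signed but the largest possible value when treated as unsigned, so TGE and TGEU give opposite results when it is compared with zero.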
3.3.5 Coprocessor instructions

The coprocessor instructions perform the operations of each coprocessor. The coprocessor load and store instructions are I-type instructions. The format of the operation instructions of the coprocessor differs depending on the coprocessor.

Table 3-33. Coprocessor Instructions

Instruction (format): description

Load Word to Coprocessor z (LWCz rt, offset (base)): Sign-extends an offset and adds it to register base to generate an address. Loads the contents of a word specified by the address to general-purpose register rt of coprocessor z.
Store Word from Coprocessor z (SWCz rt, offset (base)): Sign-extends an offset and adds it to register base to generate an address. Stores the contents of general-purpose register rt of coprocessor z in the memory location specified by the address.
Move to Coprocessor z (MTCz rt, rd): Transfers the contents of CPU register rt to register rd of coprocessor z.
Move from Coprocessor z (MFCz rt, rd): Transfers the contents of register rd of coprocessor z to CPU register rt.
Move Control to Coprocessor z (CTCz rt, rd): Transfers the contents of CPU register rt to coprocessor control register rd of coprocessor z.
Move Control from Coprocessor z (CFCz rt, rd): Transfers the contents of coprocessor control register rd of coprocessor z to CPU register rt.
Remark sub: Sub-operation code
Coprocessor z Operation (COPz cofun): Coprocessor z executes the operation defined for each coprocessor. The CPU status is not affected by the operation of the coprocessor.
Remark CO: Sub-operation identifier
[Instruction format figures, fields only: op | base | rt | offset; COPz | sub | rt | rd | 0; COPz | CO | cofun]
Table 3-34. Coprocessor Instructions (Extended ISA)

Instruction (format): description

Doubleword Move to Coprocessor z (DMTCz rt, rd): Transfers the contents of general-purpose register rt of the CPU to register rd of coprocessor z.
Doubleword Move from Coprocessor z (DMFCz rt, rd): Transfers the contents of register rd of coprocessor z to general-purpose register rt of the CPU.
Remark sub: Sub-operation code
Load Doubleword to Coprocessor z (LDCz rt, offset (base)): Sign-extends an offset and adds it to register base to generate an address. Loads the contents of the doubleword specified by the address to a general-purpose register (rt if FR = 1, or rt and rt + 1 if FR = 0) of coprocessor z.
Store Doubleword from Coprocessor z (SDCz rt, offset (base)): Sign-extends an offset and adds it to register base to generate an address. Stores the contents of the doubleword of a general-purpose register (rt if FR = 1, or rt and rt + 1 if FR = 0) of coprocessor z in the memory location specified by the address.

3.3.6 System control coprocessor (CP0) instructions

System control coprocessor (CP0) instructions perform operations specifically on the CP0 registers to manipulate the memory management and exception handling facilities of the processor.

Table 3-35. System Control Coprocessor (CP0) Instructions (1/2)

Instruction (format): description

Move to System Control Coprocessor (MTC0 rt, rd): The word data of general register rt in the CPU is loaded to general register rd in the CP0.
Move from System Control Coprocessor (MFC0 rt, rd): The word data of general register rd in the CP0 is loaded to general register rt in the CPU.
Doubleword Move to System Control Coprocessor 0 (DMTC0 rt, rd): The doubleword data of general register rt in the CPU is loaded to general register rd in the CP0.
Doubleword Move from System Control Coprocessor 0 (DMFC0 rt, rd): The doubleword data of general register rd in the CP0 is loaded to general register rt in the CPU.
Remark sub: Sub-operation code
[Instruction format figures, fields only: COP0 | sub | rt | rd | 0; op | base | rt | offset; COPz | sub | rt | rd | 0]
Table 3-35. System Control Coprocessor (CP0) Instructions (2/2)

Instruction (format): description

Read Indexed TLB Entry (TLBR): The TLB entry indexed by the Index register is loaded to the EntryHi, EntryLo0, EntryLo1, and PageMask registers.
Write Indexed TLB Entry (TLBWI): The contents of the EntryHi, EntryLo0, EntryLo1, and PageMask registers are loaded to the TLB entry indexed by the Index register.
Write Random TLB Entry (TLBWR): The contents of the EntryHi, EntryLo0, EntryLo1, and PageMask registers are loaded to the TLB entry indexed by the Random register.
Probe TLB for Matching Entry (TLBP): The address of the TLB entry that matches the contents of the EntryHi register is loaded to the Index register.
Return from Exception (ERET): The program returns from an exception, interrupt, or error trap.
Remark CO: Sub-operation identifier
Cache Operation (CACHE op, offset (base)): Sign-extends the 16-bit offset and adds it to the contents of register base to generate a virtual address. This virtual address is translated to a physical address with the TLB. The cache operation indicated by the 5-bit op is performed for this physical address.

Table 3-36. System Control Coprocessor (CP0) Instructions (for VR5500)

Instruction (format): description

Wait (WAIT): The processor's operating mode is shifted to standby mode.
Remark CO: Sub-operation identifier
Move to Performance Counter (MTPC rt, reg): The contents of general-purpose register rt in the CPU are loaded to performance counter reg in the CP0.
Move from Performance Counter (MFPC rt, reg): The contents of performance counter reg in the CP0 are loaded to general-purpose register rt in the CPU.
Move to Performance Event Specifier (MTPS rt, reg): The contents of general-purpose register rt in the CPU are loaded to performance counter control register reg in the CP0.
Move from Performance Event Specifier (MFPS rt, reg): The contents of performance counter control register reg in the CP0 are loaded to general-purpose register rt in the CPU.
Remark sub: Sub-operation code
[Instruction format figures, fields only: COP0 | CO | funct; CACHE | base | op | offset; COP0 | CO | code | funct; COP0 | sub | rt | rd | 0 | reg]
Chapter 4 Pipeline

This chapter explains the pipeline.

4.1 Overview

Pipelining is a method of instruction execution that divides instruction execution processing into several stages. An instruction has been completely executed when it has gone through all the stages. When one instruction has been processed in one stage, the next instruction enters that stage.

The operating clock of the pipeline is called PClock, and one of its cycles is called a PCycle. Each stage of the pipeline is executed in 1 PCycle.

The pipeline of the VR5500 has a two-way superscalar architecture in which two instructions are fetched at a time. The instructions are executed in the pipeline out of order. If the pipeline is completely filled, execution of two instructions can be completed in 1 PCycle.
4.1.1 Pipeline stages

The VR5500 has six execution units including integer operation, floating-point operation (including sum-of-products operation), load/store, and branch units. Each of these units operates independently. Therefore, the number of stages of the pipeline differs depending on the instruction. For example, an integer arithmetic operation instruction uses nine stages.

The stages that make up the pipeline include the following.

IF: Instruction fetch
BR: Branch prediction
IQ: Instruction queue
RN: Register renaming
RS: Reservation station
RF: Register fetch
EX: Execution
DF: Data fetch
AL: Data align
WB: Writeback
COR: Commit register
COM: Commit memory

Figure 4-1. Pipeline Stages of VR5500 and Instruction Flow
[Figure: the fetch pipeline, the renaming & dispatch pipeline (instruction queue, renaming register, reservation station), the execution pipeline, and the commit pipeline. The execution units shown are ALU0 (integer), ALU1 (integer), LSU (load/store), BRU (JR/branch), FPU (floating-point), and FPU/MACU (floating-point/multiplication/division).]
4.1.2 Configuration of pipeline

The pipeline of the VR5500 is divided into four blocks. Each block operates independently.

(1) Fetch pipeline
The fetch pipeline generates a speculative fetch stream in accordance with branch prediction and stores fetched instructions in a 16-entry instruction queue. It can fetch two instructions per cycle from the 64-bit bus connected to the instruction cache.
If the fetched instructions include a branch or jump instruction, the fetch pipeline immediately calculates the destination address by using the branch history table and the information on the return address stack, and changes the program flow. As a result, all processing is issued speculatively. Therefore, even if the execution pipeline has not yet executed a branch instruction, the fetch pipeline continues processing branch instructions and tracing the instruction stream without stalling, until the instruction queue becomes full.

(2) Renaming & dispatch pipeline
The renaming & dispatch pipeline can receive up to two instructions from the instruction queue per cycle, and assigns a renaming register number to the received instructions. At the same time, it overwrites the register number specified as an operand with a renaming register number.
The renamed instructions are stored in the reservation station (RS). The VR5500 has an RS dedicated to each execution unit. Four entries each are available for the two ALUs, four entries for the LSU, four entries for the BRU, two entries for the FPU, and two entries for the FPU/MACU. This pipeline continues operating until the instruction queue becomes empty or the RS becomes full.
Each instruction stored in the RS is checked for its dependency upon other instructions and for the utilization status of the execution unit necessary for execution. An instruction that has been judged as executable is selected from the RS. Up to two instructions can be selected per cycle.
The instruction sequence described in the program is ignored in this selection. The two selected instructions are packed into one instruction, as in a VLIW architecture. The packed instructions are sent to the execution pipeline.
The types of instructions that can be packed are shown below.

Figure 4-2. Combination of Instructions That Can Be Packed

Higher-side instruction    Lower-side instruction
INT                        INT
INT                        BR
INT                        FP
MEM                        INT
MEM                        BR
MEM                        FP
FP                         INT
FP                         BR
FP                         FP
MAC                        INT
MAC                        BR
MAC                        FP
FP                         NOP
INT                        NOP
MEM                        NOP
NOP                        BR
NOP                        MAC

Remark INT: Integer operation
       BR: Branch
       FP: Floating-point operation
       MEM: Load/store (memory access)
       MAC: Sum-of-products operation, multiplication/division
       NOP: No operation

(3) Execution pipeline
The execution pipeline consists of six execution units. The higher side of the packed instructions is sent to the LSU, ALU0, and FPU/MACU, and is executed by one of these units. The lower side is sent to the FPU/MACU, ALU1, BRU, and FPU, and is executed by one of them.
The FPU/MACU and FPU execute floating-point operations. The FPU/MACU is an FPU with a multiplier/divider added, and can also execute integer multiplication/division.
All the execution results are stored in the renaming register assigned to the instruction, along with any exception information that has been detected.
Instructions do not stall in the execution pipeline of the VR5500. All dependency relationships and resource conflicts are resolved by the renaming & dispatch pipeline before the execution pipeline. Therefore, the execution pipeline of the VR5500 is not provided with a mechanism for stall detection.
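The packing rule of Figure 4-2 can be expressed as a simple lookup. This is a sketch only: the pair list is read from the figure in extraction order, so which type sits on the higher or lower side of each pair is an assumption, and the names PACKABLE and can_pack are hypothetical.

```python
# (higher-side type, lower-side type) pairs, as read from Figure 4-2.
# Assumption: the pairing follows the order in which the figure lists them.
PACKABLE = {
    ("INT", "INT"), ("INT", "BR"), ("INT", "FP"),
    ("MEM", "INT"), ("MEM", "BR"), ("MEM", "FP"),
    ("FP", "INT"), ("FP", "BR"), ("FP", "FP"),
    ("MAC", "INT"), ("MAC", "BR"), ("MAC", "FP"),
    ("FP", "NOP"), ("INT", "NOP"), ("MEM", "NOP"),
    ("NOP", "BR"), ("NOP", "MAC"),
}


def can_pack(higher, lower):
    """Return True if the two instruction types form a listed combination."""
    return (higher, lower) in PACKABLE
```

Under this reading, a load/store is always placed on the higher side and a branch always on the lower side, which agrees with the units named for each side in (3) Execution pipeline.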
Figure 4-3. Instruction Flow in Execution Pipeline
[Figure: an instruction leaves each of the six reservation stations (RS); the selected instructions form a packed instruction whose higher-side and lower-side instructions are sent to the execution units ALU0, ALU1, LSU, BRU, FPU, and FPU/MACU.]

(4) Commit pipeline
The commit pipeline controls the processor state. The instructions that are executed by the execution pipeline regardless of the program sequence are completed (committed) in the program sequence by this pipeline.
The commit pipeline performs the following processing.
• Checking of exception/trap
• Updating store buffer
• Updating processor state
4.2 Branch delay

The position of the instruction next to a branch instruction is called the branch delay slot. The instruction in the branch delay slot is executed regardless of whether the condition of the branch instruction (except the branch likely instruction) is satisfied or not.

To accelerate branch processing, the VR5500 has a branch prediction mechanism. This mechanism uses a branch history table (BHT) with 4096 entries (2 bits each) to record satisfaction of the condition of branch instructions executed in the past. It also uses a return address stack (RAS) to hold the address to which execution is to return after a function call. The VR5500 predicts the target address of a branch instruction in accordance with the BHT, and speculatively fetches and executes the subsequent instructions.

The pipeline of the VR5500 generates a branch delay of six cycles if branch prediction is wrong. If branch prediction is correct, the branch delay is 1 cycle. Figure 4-4 shows how branch prediction is performed and the position of the branch delay slot.

Figure 4-4. Branch Delay
[Figure: pipeline timing diagrams for (a) the case where branch prediction is correct and (b) the case where branch prediction is wrong. Each diagram shows the branch instruction, the branch delay slot, and the target instruction passing through the IF, BR/IQ, RN, RS, RF, EX, WB, COR, and COM stages, with the branch delay (see Note) marked.]
Note The branch delay is covered if there is a valid instruction in the instruction queue.
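A table of 4096 two-bit entries is commonly operated as an array of saturating counters. The sketch below assumes that scheme; the manual text above does not state the update rule, the initial counter value, or how the table is indexed, so all three are assumptions, as are the class and function names. Only the table size and the delay figures (1 cycle when the prediction is correct, 6 cycles when it is wrong) come from the text.

```python
class BranchHistoryTable:
    """Sketch of a 4096-entry, 2-bit branch history table.

    Assumptions (not stated in the manual): saturating-counter update,
    indexing by word address modulo the table size, and an initial
    state of 1 (weakly not taken).
    """

    ENTRIES = 4096

    def __init__(self):
        self.counters = [1] * self.ENTRIES

    def _index(self, pc):
        return (pc >> 2) % self.ENTRIES

    def predict(self, pc):
        """Predict taken when the counter is in its upper half (2 or 3)."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move the counter one step towards the actual outcome."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def branch_delay_cycles(predicted_taken, actually_taken):
    """Branch delay from section 4.2: 1 cycle if correct, 6 if wrong."""
    return 1 if predicted_taken == actually_taken else 6
```

With this update rule, a branch that has been taken twice in a row is still predicted taken after a single not-taken outcome, which is the usual motivation for using two bits per entry instead of one.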
4.3 Load delay

A load instruction generates a delay (load delay) until a subsequent instruction can use the result of the load. The processor performs the scheduling necessary for eliminating this delay.
Because the VR5500 uses an out-of-order mechanism to execute instructions, the delay can be covered by executing an instruction that is not dependent upon the load instruction, even if a load delay occurs.

Figure 4-5. Load Delay
[Figure: timing of an LW instruction (RF, EX, DF, AL stages) and subsequent instructions, including an ADD, each of which is dispatched and then passes through the RF and EX stages; the data transfer from the load is marked.]

4.3.1 Non-blocking load

To alleviate the penalty due to a cache miss, the data cache of the VR5500 has a non-blocking mechanism. This allows the VR5500 to continue accessing the cache while holding a cache miss, even if the cache miss occurs as a result of executing a load instruction. This means that the subsequent instructions, including other load instructions, can be executed consecutively if they have no dependency relationship with the load instruction that caused the cache miss. Up to four cache misses can be held.
4.4 Exception processing

If an exception occurs, the instruction that has caused the exception and all the subsequent instructions in the pipeline are canceled. If the instruction responsible for the exception has reached the commit stage, the following three events occur.
• The status and cause of the exception are written to each CP0 register.
• The current PC changes to an appropriate exception vector address.
• The previous exception bit is cleared.
As a result, all the instructions that had been issued before the exception occurred are completed, and all the instructions issued after the instruction responsible for the exception are discarded. Therefore, the EPC indicates the value from which execution can be resumed.
Figure 4-6 shows an example of detecting an exception.

Figure 4-6. Exception Detection
[Figure: pipeline timing diagram (IF, BR/IQ, RN, RS, RF, EX, WB, COR, COM stages) showing the point at which the exception is detected, the abort of all instructions in the pipeline, and the execution of the instruction at the exception vector.]

4.5 Store buffer

The VR5500 has a 4-entry store buffer (SB) in the DCU so that it can speculatively execute store instructions. The SB temporarily holds the store data of a speculatively executed store instruction, and actually writes the data to the cache when that store instruction is committed.

4.6 Write transaction buffer

The VR5500 has a write transaction buffer (WTB) that improves the performance of write operations to the external memory. The WTB is used for all transactions of the system interface.
The WTB is a four-stage FIFO and can hold data of up to 256 bits. It can therefore hold up to four read requests or one uncached write request or cache line writeback.
The entire WTB is used for writeback data in case of a cache miss that requires writeback, and the processor can perform processing in parallel with memory updating.
In the case of a store to an uncached area or a write-through store, processing by the WTB and writing to the memory by the CPU are not executed in parallel.
If the WTB is full, the subsequent store operation is stalled until space becomes available.
The WTB cannot be read or written by software.
Chapter 5 Memory Management System

The VR5500 has a memory management unit (MMU) that uses a high-speed translation lookaside buffer (TLB) which translates virtual addresses into physical addresses. This chapter explains in detail the operation of the TLB, the CP0 registers used as a software interface with the TLB, and the memory mapping method used to translate virtual addresses into physical addresses.

5.1 Processor modes

5.1.1 Operating modes

The VR5500 has the following three operating modes, with priority assigned by the system to these modes, starting with the one at the top.
• Kernel mode (highest priority): In this mode, all the registers can be accessed and changed. The nucleus of the operating system operates in the kernel mode.
• Supervisor mode: The priority of this mode is lower than that of the kernel mode. This mode is used for sections assigned a lower importance by the operating system.
• User mode (lowest priority): This mode prevents users from interfering with each other.
The basic operating mode of the processor is the user mode. When the processor processes an error (when the ERL bit is set) or an exception (when the EXL bit is set), it enters the kernel mode.
The operating mode of the processor is set by the KSU field of the Status register and the ERL and EXL bits. Table 5-1 shows the three operating modes, and the setting of the Status register related to the error and exception levels. A dash (-) indicates that any setting is possible.

Table 5-1. Operating Modes

Status register bit
KSU(1:0)   EXL   ERL   Operating mode
10         0     0     User mode
01         0     0     Supervisor mode
00         0     0     Kernel mode
-          1     -     Kernel mode
-          -     1     Kernel mode

In the case of an exception or error, the EXL and ERL bits are set regardless of the setting of the KSU field. When these bits are set, interrupts are disabled.
If the EXL bit is cleared by an exception handler, for example to enable processing of multiple interrupts, the processor leaves the kernel mode and enters the mode set by the KSU field. Therefore, change the KSU field before the exception handler clears the EXL bit.
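The mode selection in Table 5-1 can be sketched as a small decoding function. The function name is hypothetical, and the fields are passed in separately so that no bit positions within the Status register need to be assumed. The KSU value 11 is not listed in the table, so the sketch rejects it.

```python
def operating_mode(ksu, exl, erl):
    """Decode the operating mode from KSU(1:0), EXL, and ERL (Table 5-1)."""
    # An exception (EXL) or error (ERL) forces kernel mode regardless of KSU.
    if exl or erl:
        return "kernel"
    modes = {0b10: "user", 0b01: "supervisor", 0b00: "kernel"}
    if ksu not in modes:
        # KSU = 11 does not appear in Table 5-1.
        raise ValueError("KSU value not listed in Table 5-1")
    return modes[ksu]
```

This also shows why the KSU field should be set up before EXL is cleared: with KSU = 10, the result changes from "kernel" to "user" as soon as EXL goes from 1 to 0.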
5.1.2 Instruction set modes

The instruction set mode of the processor determines which instructions are enabled. By default, the MIPS IV instruction set architecture (ISA) is implemented. However, the MIPS III ISA or MIPS I/II ISA can also be used to maintain compatibility with a conventional machine.
The instruction set mode is set by bits UX, SX, and XX of the Status register. Table 5-2 shows the setting of the Status register related to the instruction set mode. A dash (-) indicates that any setting is possible.

Table 5-2. Instruction Set Modes

                  Status register bit   Instruction set mode
Operating mode    UX   SX   XX          MIPS I, II     MIPS III          MIPS IV
User mode         0    -    0           Can be used    Cannot be used    Cannot be used
User mode         0    -    1           Can be used    Cannot be used    Can be used
User mode         1    -    0           Can be used    Can be used       Cannot be used
User mode         1    -    1           Can be used    Can be used       Can be used
Supervisor mode   -    0    -           Can be used    Cannot be used    Can be used
Supervisor mode   -    1    -           Can be used    Can be used       Can be used
Kernel mode       -    -    -           Can be used    Can be used       Can be used

5.1.3 Addressing modes

The addressing mode of the processor determines whether a 32-bit or 64-bit memory address is to be generated. Refer to Table 5-3 for the settings of the following addressing modes.
• In the kernel mode, 64-bit addressing is enabled by the KX bit. All the instructions are always valid.
• In the supervisor mode, 64-bit addressing and the MIPS III instructions are enabled by the SX bit.
• In the user mode, 64-bit addressing and the MIPS III instructions are enabled by the UX bit. In addition, the MIPS IV instructions are enabled by the XX bit.

Table 5-3. Addressing Modes

                  Status register bit
Operating mode    UX   SX   KX          Addressing mode
User mode         0    -    -           32-bit
User mode         1    -    -           64-bit
Supervisor mode   -    0    -           32-bit
Supervisor mode   -    1    -           64-bit
Kernel mode       -    -    0           32-bit
Kernel mode       -    -    1           64-bit
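Tables 5-2 and 5-3 can be combined into one lookup, sketched below. The function name, the mode strings, and the dictionary keys are hypothetical; the rules themselves are taken from the two tables.

```python
def mode_capabilities(mode, ux=0, sx=0, kx=0, xx=0):
    """Return address width and enabled ISA levels per Tables 5-2 and 5-3.

    MIPS I/II instructions can be used in every mode, so only the
    MIPS III and MIPS IV enables are reported.
    """
    if mode == "user":
        return {"addr_bits": 64 if ux else 32,
                "mips3": bool(ux), "mips4": bool(xx)}
    if mode == "supervisor":
        return {"addr_bits": 64 if sx else 32,
                "mips3": bool(sx), "mips4": True}
    if mode == "kernel":
        return {"addr_bits": 64 if kx else 32,
                "mips3": True, "mips4": True}
    raise ValueError("unknown operating mode")
```

Note that in the kernel mode the KX bit affects only the address width; all instructions remain valid whichever value it has.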
5.2 Translation lookaside buffer (TLB)

Virtual addresses are translated into physical addresses using an on-chip TLB (see Note). The on-chip TLB is a fully-associative memory that holds 48 entries, each of which maps a pair of odd/even pages. These pages can have ten different sizes, 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB, 64 MB, 256 MB, and 1 GB, and the size can be specified for each entry. When a virtual address is supplied, all 48 TLB entries are checked simultaneously to see whether they match the virtual address together with the ASID field held in the EntryHi register.
If there is a virtual address match (hit) in the TLB, a physical address is created from the physical page number and the offset value. If no match occurs (miss), an exception is taken and software refills the TLB entry from the page table resident in memory. The software writes to an entry selected using the Index register or to a random entry indicated in the Random register.
If more than one entry in the TLB matches the virtual address being translated, the operation is undefined. In this case, the TS bit of the Status register is set to 1, and a TLB refill exception occurs regardless of the valid bit status of the TLB entry. Replace the TLB entry using the exception handler and clear the TS bit to 0.

Note Depending on the address space, virtual addresses may be translated to physical addresses without using the TLB. For example, address translation for the kseg0 or kseg1 address space does not use mapping. The physical addresses of these address spaces are determined by subtracting the base address of the address space from the virtual addresses.

(1) Micro TLB
The VR5500 has two 4-entry micro TLBs in addition to the 48-entry TLB. These TLBs are also fully-associative memories and are respectively dedicated to the translation of instruction and data addresses.
The micro TLBs hold a subset of the TLB, and the page size can be set for each entry in the same manner as for the TLB. If a miss occurs in a micro TLB, an entry is replaced with a new entry from the TLB by using a pseudo-LRU (least recently used) algorithm. The pipeline stalls while an entry is being transferred from the TLB.
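The untranslated case mentioned in the Note of 5.2 (physical address = virtual address minus the base address of the address space) can be sketched as follows. The base addresses and segment size used here are the standard MIPS values for kseg0 and kseg1 in 32-bit mode; they are not stated in this section and should be treated as assumptions, as should the function name.

```python
# Assumption: standard MIPS 32-bit mode segment layout.
KSEG0_BASE = 0x8000_0000
KSEG1_BASE = 0xA000_0000
KSEG_SIZE = 0x2000_0000  # 512 MB per segment


def unmapped_physical(vaddr):
    """Return the physical address for a kseg0/kseg1 virtual address.

    Returns None for addresses outside kseg0/kseg1, which are not
    covered by this sketch.
    """
    if KSEG0_BASE <= vaddr < KSEG0_BASE + KSEG_SIZE:
        return vaddr - KSEG0_BASE
    if KSEG1_BASE <= vaddr < KSEG1_BASE + KSEG_SIZE:
        return vaddr - KSEG1_BASE
    return None
```

Both segments map onto the same physical range, so a kseg0 address and the kseg1 address 0x2000_0000 above it refer to the same physical location.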
5.2.1 Format of TLB entry

Figure 5-1 shows the TLB entry formats for both 32- and 64-bit modes. Each field of an entry has a corresponding field in the EntryHi, EntryLo0, EntryLo1, or PageMask registers.

Figure 5-1. Format of TLB Entry
[Figure: (a) 32-bit addressing mode: a 128-bit entry (bits 127 to 0) made up of the MASK field, the VPN2, G, and ASID fields, and two sets of PFN, C, D, and V fields. (b) 64-bit addressing mode: a 256-bit entry (bits 255 to 0) made up of the MASK field, the R, VPN2, G, and ASID fields, and two sets of PFN, C, D, and V fields. Unused bits are 0.]

The format of the EntryHi, EntryLo0, EntryLo1, and PageMask registers is almost the same as that of a TLB entry. However, the bit at the position corresponding to the TLB G bit is reserved (0) in the EntryHi register. The bit at the position corresponding to the G bit of the EntryLo register is reserved (0) in the TLB. For details of other fields, refer to the description of the relevant registers.
The contents of the TLB entries can be read or written via the EntryHi, EntryLo0, EntryLo1, and PageMask registers using a TLB manipulation instruction, as shown in Figure 5-2. The target entry is either one specified by the Index register, or a random entry indicated by the Random register.
figure 5-2. outline of tlb manipulation
[figure: the pagemask, entryhi, entrylo1, and entrylo0 registers are transferred to or from one of tlb entries 0 to 47 (bits 127/255 to 0), selected using the index register or the random register.]

5.2.2 tlb instructions

the instructions used for tlb control are described below.

(1) tlbp (translation lookaside buffer probe)
the tlbp instruction loads the index register with the number of the tlb entry that matches the contents of the entryhi register. if there is no matching tlb entry, the most significant bit of the index register is set (1).

(2) tlbr (translation lookaside buffer read)
the tlbr instruction loads the entryhi, entrylo0, entrylo1, and pagemask registers with the contents of the tlb entry indicated by the contents of the index register.

(3) tlbwi (translation lookaside buffer write index)
the tlbwi instruction writes the contents of the entryhi, entrylo0, entrylo1, and pagemask registers to the tlb entry indicated by the contents of the index register.

(4) tlbwr (translation lookaside buffer write random)
the tlbwr instruction writes the contents of the entryhi, entrylo0, entrylo1, and pagemask registers to the tlb entry indicated by the contents of the random register.

5.2.3 tlb exception

if there is no tlb entry that matches the virtual address, a tlb refill exception occurs. if the access control bits (d and v) indicate that the access is not valid, a tlb modified or tlb invalid exception occurs. refer to chapter 6 exception processing for details of tlb exceptions.
5.3 virtual-to-physical address translation

translating a virtual address to a physical address begins by comparing the virtual address sent from the processor with the virtual addresses of all entries in the tlb. first, one of the following comparisons is made for the virtual page number (vpn) of the address.

- in 32-bit mode: the higher bits (see note) of the virtual address are compared to the contents of the vpn2 (virtual page number divided by two) of each tlb entry.
- in 64-bit mode: the higher bits (see note) of the virtual address are compared to the contents of the r and the vpn2 (virtual page number divided by two) of each tlb entry.

note: the number of bits differs depending on the page size. the table below shows examples of the higher bits of the virtual address with page sizes of 16 mb and 4 kb.

page size   32-bit mode   64-bit mode
16 mb       a(31:25)      a63, a62, a(39:25)
4 kb        a(31:13)      a63, a62, a(39:13)

when an entry matches in this comparison, a match occurs if either of the following also applies.

- the global bit (g) of the tlb entry is set to 1.
- the asid field of the virtual address is the same as the asid field of the tlb entry.

this match is referred to as a tlb hit. if the matching entry is in the tlb, the physical address and access control bits (c, d, v) are read out from that entry. in order to perform valid address translation, the entry's v bit must be set (1), but this is unrelated to the determination of the matching tlb entry.

an offset value is added to the physical address that was read out. the offset indicates an address inside the page frame space. the offset part bypasses the tlb, and the lower bits of the virtual address are output as they are.
if there is no match, the processor core generates a tlb refill exception. software then references the page table in memory, in which virtual addresses and physical addresses are paired, and writes its contents to the tlb. figure 5-3 shows a summary of address translation, and figure 5-4 shows the tlb address translation flowchart.
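the matching rule described above can be sketched in python as follows. this is a simplified software model for illustration only, not the hardware implementation: the class and function names are hypothetical, each entry stores the physical base address of its even and odd page instead of an encoded pfn, and only 32-bit mode (no r field) is modeled.

```python
from dataclasses import dataclass

@dataclass
class TlbEntry:
    # Hypothetical, simplified model of one TLB entry (32-bit mode).
    vpn2: int         # virtual page number divided by two
    asid: int         # 8-bit address space identifier
    g: bool           # global bit: when set, the ASID is ignored
    offset_bits: int  # log2(page size): 12 for 4 KB ... 30 for 1 GB
    base_even: int    # physical base address of the even page
    base_odd: int     # physical base address of the odd page
    v_even: bool      # valid bit of the even page
    v_odd: bool       # valid bit of the odd page

def translate(entries, vaddr, asid):
    """Return (physical address, valid) on a hit, or None on a miss."""
    for e in entries:
        # Compare the higher bits of the address with VPN2.
        if (vaddr >> (e.offset_bits + 1)) != e.vpn2:
            continue
        # A hit additionally needs G = 1 or an ASID match.
        if not (e.g or e.asid == asid):
            continue
        # The bit just above the offset selects the even or odd page.
        odd = (vaddr >> e.offset_bits) & 1
        base = e.base_odd if odd else e.base_even
        valid = e.v_odd if odd else e.v_even
        # The offset bypasses the TLB unchanged.
        offset = vaddr & ((1 << e.offset_bits) - 1)
        return base | offset, valid
    return None  # miss: a TLB refill exception would be taken

entry = TlbEntry(vpn2=0x00400000 >> 13, asid=5, g=False, offset_bits=12,
                 base_even=0x10000, base_odd=0x20000,
                 v_even=True, v_odd=True)
print(translate([entry], 0x00400ABC, 5))
```

note that the v bit is returned separately rather than used in the match, mirroring the statement that the v bit is unrelated to the determination of the matching entry.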
figure 5-3. virtual-to-physical address translation
[figure: a virtual address (asid, vpn, offset) is compared with the g, asid, and vpn fields of a tlb entry; the pfn of the matching entry and the offset form the physical address.]

<1> the virtual address page number (vpn, higher bits in the address) and asid are compared with the corresponding area in the tlb.
<2> if there is a matching entry, the page frame number (pfn) representing the higher bits of the physical address is output from the tlb.
<3> the offset, which bypasses the tlb, is then added to the pfn.
figure 5-4. tlb address translation
[flowchart: starting from virtual address input. decisions: user mode?, supervisor mode?, address ok?, 32-bit address?, vpn match?, g bit = 1?, asid match?, not a multi hit?, v bit = 1?, write?, d bit = 1?, uncached area?. outcomes: address error exception, tlb refill exception, xtlb refill exception, tlb invalid exception, tlb modified exception, ts bit of status register set to 1, physical address output with the tlb not used, main memory access, cache access.]
5.3.1 32-bit addressing mode address translation

figure 5-5 shows the virtual-to-physical address translation in 32-bit addressing mode. the page size can be selected from ten sizes, from 4 kb (12 bits) to 1 gb (30 bits), each four times the previous one.

- shown at the top of figure 5-5 is the virtual address space in which the page size is 4 kb and the offset is 12 bits. the 20 bits excluding the asid field represent the virtual page number (vpn), enabling selection of a page table of 1 m entries.
- shown at the bottom of figure 5-5 is the virtual address space in which the page size is 16 mb and the offset is 24 bits. the 8 bits excluding the asid field represent the vpn, enabling selection of a page table of 256 entries.

figure 5-5. virtual address translation in 32-bit addressing mode
[figure: virtual address for a 4 kb page: asid in bits 39 to 32, vpn in bits 31 to 12 (20 bits = 1 m pages, 2^20), offset in bits 11 to 0. virtual address for a 16 mb page: asid in bits 39 to 32, vpn in bits 31 to 24 (8 bits = 256 pages, 2^8), offset in bits 23 to 0. in both cases the tlb translates the vpn into the pfn, and the offset is used for the 36-bit physical address (bits 35 to 0) without being changed.]

note: user, supervisor, or kernel address space is selected by bits 31 to 29 of the virtual address.

remark: bits 35 to 32 of the physical address are not output in the 32-bit bus mode.
5.3.2 64-bit addressing mode address translation

figure 5-6 shows the virtual-to-physical address translation in 64-bit addressing mode. the page size can be selected from ten sizes, from 4 kb (12 bits) to 1 gb (30 bits), each four times the previous one.

- shown at the top of figure 5-6 is the virtual address space in which the page size is 4 kb and the offset is 12 bits. the 28 bits excluding the asid field represent the virtual page number (vpn), enabling selection of a page table of 256 m entries.
- shown at the bottom of figure 5-6 is the virtual address space in which the page size is 16 mb and the offset is 24 bits. the 16 bits excluding the asid field represent the vpn, enabling selection of a page table of 64 k entries.

figure 5-6. virtual address translation in 64-bit addressing mode
[figure: virtual address for a 4 kb page: asid in bits 71 to 64, address space selection in bits 63 and 62, bits 61 to 40 all 0 or all 1 (-1), vpn in bits 39 to 12 (28 bits = 256 m pages, 2^28), offset in bits 11 to 0. virtual address for a 16 mb page: the same layout with the vpn in bits 39 to 24 (16 bits = 64 k pages, 2^16) and the offset in bits 23 to 0. in both cases the tlb translates the vpn into the pfn, and the offset is used for the 36-bit physical address (bits 35 to 0) without being changed.]

note: user, supervisor, or kernel address space is selected by bits 63 and 62 of the virtual address.

remark: bits 35 to 32 of the physical address are not output in the 32-bit bus mode.
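the offset and vpn widths quoted in 5.3.1 and 5.3.2 follow directly from the page size. the short python sketch below reproduces that arithmetic; the function name and the choice of passing the mapped address width (32 bits in 32-bit mode, 40 bits in 64-bit mode) as a parameter are illustrative assumptions.

```python
# The ten page sizes: 4 KB, 16 KB, ..., 1 GB (each 4 times the previous one).
PAGE_SIZES = [4096 * 4**k for k in range(10)]

def split_bits(page_size, va_bits):
    """Return (offset bits, VPN bits) for a page size and address width."""
    offset_bits = page_size.bit_length() - 1  # log2 of a power of two
    return offset_bits, va_bits - offset_bits

for size in (4096, 16 << 20):
    for va_bits in (32, 40):
        offset_bits, vpn_bits = split_bits(size, va_bits)
        print(size, va_bits, offset_bits, vpn_bits, 2**vpn_bits)
```

for a 4 kb page this gives a 20-bit vpn (1 m pages) in 32-bit mode and a 28-bit vpn (256 m pages) in 64-bit mode; for a 16 mb page it gives 8 bits (256 pages) and 16 bits (64 k pages), respectively.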
5.4 virtual address space

the memory management system extends the address space of the cpu by translating addresses in a large virtual address space into physical addresses. the vr5500 has three types of virtual address spaces: user, supervisor, and kernel. the addressing mode of each of these virtual address spaces can be set to 32-bit or 64-bit mode.

in the 32-bit addressing mode, a virtual address is 32 bits wide, and the maximum user area is 2 gb (2^31 bytes). in the 64-bit addressing mode, the virtual address is 64 bits wide and the maximum user area is 1 tb (2^40 bytes).

the virtual address is extended with an address space identifier (asid) (refer to figures 5-5 and 5-6), which reduces the frequency of tlb flushing when switching contexts. this 8-bit asid is in the cp0 entryhi register, and the global (g) bit is in the entrylo0 and entrylo1 registers, described later in this chapter.

when the system interface is in the 32-bit bus mode, the vr5500 uses 32-bit physical addresses. consequently, the physical address space is 4 gb. in the 64-bit bus mode, the physical address space is 64 gb (2^36 bytes) because the vr5500 uses 36-bit physical addresses.

caution: if the system interface of the vr5500 is in the 32-bit bus mode, an address error exception does not occur and physical addresses are processed with bits 35 to 32 ignored, even if the space is referenced so that bits 35 to 32 of the physical address are a value other than 0.
5.4.1 user mode virtual address space

in user mode, a 2 gb (2^31 bytes) virtual address space (useg) can be used in 32-bit addressing mode. in 64-bit addressing mode, a 1 tb (2^40 bytes) virtual address space (xuseg) can be used. useg and xuseg can be referenced via the tlb. whether a cache is used or not is determined for each page by the tlb entry (depending on the c bit setting in the tlb entry). the user address space can be accessed in supervisor mode and kernel mode.

the user segment starts at address 0 and the current active user process resides in either useg (in 32-bit addressing mode) or xuseg (in 64-bit addressing mode).

the vr5500 operates in user mode when the status register contains the following bit values.

- ksu field = 10
- exl bit = 0
- erl bit = 0

in addition, the ux bit in the status register selects the addressing mode as follows.

- when ux bit = 0: 32-bit useg space is selected. a tlb mismatch is processed by the 32-bit tlb refill exception handler.
- when ux bit = 1: 64-bit xuseg space is selected. a tlb mismatch is processed by the 64-bit xtlb refill exception handler.

figure 5-7 shows user mode address mapping and table 5-4 lists the characteristics of the user segments.
figure 5-7. user mode address space
[figure: 32-bit mode: 0x0000 0000 to 0x7fff ffff is useg (2 gb with tlb mapping); 0x8000 0000 to 0xffff ffff causes an address error. 64-bit mode: 0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff is xuseg (1 tb with tlb mapping); 0x0000 0100 0000 0000 to 0xffff ffff ffff ffff causes an address error.]

remark: when a 2's complement overflow occurs in the address calculation, the calculated address is invalid and the result is not defined.

table 5-4. 32-bit and 64-bit user mode segments

addressing mode  address bit value  ksu  exl  erl  ux  segment name  address range                                   size
32-bit           a31 = 0            any  0    0    0   useg          0x0000 0000 to 0x7fff ffff                      2 gb (2^31 bytes)
64-bit           a(63:40) = 0       any  0    0    1   xuseg         0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  1 tb (2^40 bytes)

(ksu, exl, erl, and ux are status register bit values.)
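the address checks in table 5-4 can be written out as a small python sketch. the function name is hypothetical, addresses are treated as unsigned integers, and a return value of none stands for the address error exception.

```python
def user_segment(vaddr, ux):
    """Classify a user-mode virtual address; None means address error."""
    if ux:
        # 64-bit mode: bits 63 to 40 must all be 0 (xuseg, 1 TB).
        return "xuseg" if (vaddr >> 40) == 0 else None
    # 32-bit mode: bit 31 must be 0 (useg, 2 GB).
    return "useg" if (vaddr >> 31) == 0 else None

print(user_segment(0x7FFFFFFF, ux=0))        # last byte of useg
print(user_segment(0x80000000, ux=0))        # address error
print(user_segment(0x000000FFFFFFFFFF, ux=1))  # last byte of xuseg
```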
(1) useg (32-bit mode)
when the ux bit of the status register is 0 and the most significant bit of the virtual address is 0, this virtual address space is labeled useg. any attempt to reference an address whose most significant bit is 1 causes an address error exception (refer to chapter 6 exception processing).

(2) xuseg (64-bit mode)
when the ux bit of the status register is 1 and bits 63 to 40 of the virtual address are all 0, this virtual address space is labeled xuseg, and 1 tb (2^40 bytes) of the user address space can be used. any attempt to reference an address in which any of bits 63 to 40 is 1 causes an address error exception (refer to chapter 6 exception processing).

5.4.2 supervisor mode virtual address space

supervisor mode is intended for layered operating systems: the kernel at the highest layer is executed in kernel mode, and the rest of the operating system is executed in supervisor mode. suseg, sseg, xsuseg, xsseg, and csseg (all the spaces) can be referenced via the tlb. whether a cache is used or not is determined for each page by the tlb entry (depending on the c bit setting in the tlb entry). the supervisor address space can be accessed in kernel mode.

the processor operates in supervisor mode when the status register contains the following bit values.

- ksu field = 01
- exl bit = 0
- erl bit = 0

in addition, the sx bit in the status register selects the addressing mode as follows.

- when sx bit = 0: 32-bit supervisor space is selected. a tlb mismatch is processed by the 32-bit tlb refill exception handler.
- when sx bit = 1: 64-bit supervisor space is selected. a tlb mismatch is processed by the 64-bit xtlb refill exception handler.

figure 5-8 shows supervisor mode address mapping and table 5-5 lists the characteristics of the segments in supervisor mode.
figure 5-8. supervisor mode address space
[figure: 32-bit mode: 0x0000 0000 to 0x7fff ffff is suseg (2 gb with tlb mapping); 0x8000 0000 to 0xbfff ffff causes an address error; 0xc000 0000 to 0xdfff ffff is sseg (0.5 gb with tlb mapping); 0xe000 0000 to 0xffff ffff causes an address error. 64-bit mode: 0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff is xsuseg (1 tb with tlb mapping); 0x0000 0100 0000 0000 to 0x3fff ffff ffff ffff causes an address error; 0x4000 0000 0000 0000 to 0x4000 00ff ffff ffff is xsseg (1 tb with tlb mapping); 0x4000 0100 0000 0000 to 0xffff ffff bfff ffff causes an address error; 0xffff ffff c000 0000 to 0xffff ffff dfff ffff is csseg (0.5 gb with tlb mapping); 0xffff ffff e000 0000 to 0xffff ffff ffff ffff causes an address error.]

remark: when a 2's complement overflow occurs in the address calculation, the calculated address is invalid and the result is not defined.
table 5-5. 32-bit and 64-bit supervisor mode segments

addressing mode  address bit value  ksu       exl  erl  sx  segment name  address range                                   size
32-bit           a31 = 0            01 or 00  0    0    0   suseg         0x0000 0000 to 0x7fff ffff                      2 gb (2^31 bytes)
32-bit           a(31:29) = 110     01 or 00  0    0    0   sseg          0xc000 0000 to 0xdfff ffff                      512 mb (2^29 bytes)
64-bit           a(63:62) = 00      01 or 00  0    0    1   xsuseg        0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  1 tb (2^40 bytes)
64-bit           a(63:62) = 01      01 or 00  0    0    1   xsseg         0x4000 0000 0000 0000 to 0x4000 00ff ffff ffff  1 tb (2^40 bytes)
64-bit           a(63:62) = 11      01 or 00  0    0    1   csseg         0xffff ffff c000 0000 to 0xffff ffff dfff ffff  512 mb (2^29 bytes)

(ksu, exl, erl, and sx are status register bit values.)

(1) suseg (32-bit supervisor mode, user space)
when the sx bit of the status register is 0 and the most significant bit of the virtual address is 0, the suseg virtual address space is selected; it covers 2 gb (2^31 bytes) of the current user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.

(2) sseg (32-bit supervisor mode, supervisor space)
when the sx bit of the status register is 0 and the higher 3 bits of the virtual address are 110, the sseg virtual address space is selected; it covers 512 mb (2^29 bytes) of the current supervisor virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.

(3) xsuseg (64-bit supervisor mode, user space)
when the sx bit of the status register is 1 and bits 63 and 62 of the virtual address are 00, the xsuseg virtual address space is selected; it covers 1 tb (2^40 bytes) of the current user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
(4) xsseg (64-bit supervisor mode, current supervisor space)
when the sx bit of the status register is 1 and bits 63 and 62 of the virtual address are 01, the xsseg virtual address space is selected; it covers 1 tb (2^40 bytes) of the current supervisor virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.

(5) csseg (64-bit supervisor mode, separate supervisor space)
when the sx bit of the status register is 1 and bits 63 and 62 of the virtual address are 11, the csseg virtual address space is selected; it covers 512 mb (2^29 bytes) of the separate supervisor virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
5.4.3 kernel mode virtual address space

if the status register satisfies any of the following conditions, the processor runs in kernel mode.

- ksu = 00
- exl = 1
- erl = 1

the addressing width in kernel mode varies according to the state of the kx bit of the status register, as follows.

- when kx = 0: 32-bit kernel space is selected. a tlb mismatch is processed by the 32-bit tlb refill exception handler.
- when kx = 1: 64-bit kernel space is selected. a tlb mismatch is processed by the 64-bit xtlb refill exception handler.

the processor enters kernel mode whenever an exception is detected, and it remains in kernel mode until an exception return (eret) instruction is executed and results in erl and/or exl = 0. the eret instruction restores the processor to the mode existing prior to the exception.

kernel mode virtual address space is divided into regions differentiated by the higher bits of the virtual address, as shown in figure 5-9. table 5-6 lists the characteristics of the 32-bit kernel mode segments, and table 5-7 lists the characteristics of the 64-bit kernel mode segments.
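the mode conditions given in 5.4.1 to 5.4.3 can be combined into one decoding routine, sketched below in python. the bit positions used (exl = bit 1, erl = bit 2, ksu = bits 4 and 3, ux = bit 5, sx = bit 6, kx = bit 7) are not stated in this section; they are assumed here from the usual mips iii status register layout and should be checked against the status register description. the function names are hypothetical.

```python
# Assumed Status register bit positions (not given in this section).
EXL = 1 << 1
ERL = 1 << 2
KSU_SHIFT = 3
UX, SX, KX = 1 << 5, 1 << 6, 1 << 7

def operating_mode(status):
    """Return 'kernel', 'supervisor', 'user', or None for KSU = 11."""
    if status & (EXL | ERL):
        return "kernel"  # EXL = 1 or ERL = 1 forces kernel mode
    ksu = (status >> KSU_SHIFT) & 0b11
    return {0b00: "kernel", 0b01: "supervisor", 0b10: "user"}.get(ksu)

def refill_handler(status):
    """Return which refill handler processes a TLB mismatch."""
    mode = operating_mode(status)
    if mode is None:
        return None
    width_bit = {"kernel": KX, "supervisor": SX, "user": UX}[mode]
    return "xtlb refill" if status & width_bit else "tlb refill"

print(operating_mode(0b10 << KSU_SHIFT))          # user
print(operating_mode((0b10 << KSU_SHIFT) | EXL))  # kernel
print(refill_handler(KX))                         # xtlb refill
```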
figure 5-9. kernel mode address space
[figure: memory maps for 32-bit mode (kuseg, kseg0, kseg1, ksseg, kseg3) and for 64-bit mode (xkuseg, xksseg, xkphys, xkseg, ckseg0, ckseg1, cksseg, ckseg3), with address error regions between the 64-bit segments. kseg0 and ckseg0 are 0.5 gb without tlb mapping, cacheable; kseg1 and ckseg1 are 0.5 gb without tlb mapping, uncached; xkphys is without tlb mapping (see figure 5-10); the other segments use tlb mapping. the address ranges of the segments are those listed in tables 5-6 and 5-7.]

remark: when a 2's complement overflow occurs in the address calculation, the calculated address is invalid and the result is not defined.
figure 5-10. xkphys area address space
[figure: the xkphys area, 0x8000 0000 0000 0000 to 0xbfff ffff ffff ffff, is divided into eight blocks starting at 0x8000..., 0x8800..., 0x9000..., 0x9800..., 0xa000..., 0xa800..., 0xb000..., and 0xb800 0000 0000 0000. the first 64 gb of each block (offsets 0x0 0000 0000 to 0xf ffff ffff) is a space without tlb mapping whose cache attribute is one of reserved; cacheable, write-through; uncached; cacheable, writeback; or uncached, accelerated. the remainder of each block, from offset 0x10 0000 0000 upward, causes an address error.]
table 5-6. 32-bit kernel mode segments

address bit value  kx  segment name  virtual address             physical address            size
a31 = 0            0   kuseg         0x0000 0000 to 0x7fff ffff  tlb map                     2 gb (2^31 bytes)
a(31:29) = 100     0   kseg0         0x8000 0000 to 0x9fff ffff  0x0000 0000 to 0x1fff ffff  512 mb (2^29 bytes)
a(31:29) = 101     0   kseg1         0xa000 0000 to 0xbfff ffff  0x0000 0000 to 0x1fff ffff  512 mb (2^29 bytes)
a(31:29) = 110     0   ksseg         0xc000 0000 to 0xdfff ffff  tlb map                     512 mb (2^29 bytes)
a(31:29) = 111     0   kseg3         0xe000 0000 to 0xffff ffff  tlb map                     512 mb (2^29 bytes)

(for all rows, the status register bit values are ksu = 00 or exl = 1 or erl = 1.)

(1) kuseg (32-bit kernel mode, user space)
when the kx bit of the status register is 0 and the most significant bit of the virtual address is 0, the kuseg virtual address space is selected; it is the current 2 gb (2^31 bytes) user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to kuseg are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.
if the erl bit of the status register is 1, the user address space is assigned 2 gb (2^31 bytes) without tlb mapping and becomes unmapped (with virtual addresses being used as physical addresses) and uncached.

(2) kseg0 (32-bit kernel mode, kernel space 0)
when the kx bit of the status register is 0 and the higher 3 bits of the virtual address are 100, the kseg0 virtual address space is selected; it is the current 512 mb (2^29 bytes) physical space.
references to kseg0 are not mapped through the tlb; the physical address selected is defined by subtracting 0x8000 0000 from the virtual address. the k0 field of the config register controls cacheability (see 5.5.8 config register (16)).
(3) kseg1 (32-bit kernel mode, kernel space 1)
when the kx bit of the status register is 0 and the higher 3 bits of the virtual address are 101, the kseg1 virtual address space is selected; it is the current 512 mb (2^29 bytes) physical space.
references to kseg1 are not mapped through the tlb; the physical address selected is defined by subtracting 0xa000 0000 from the virtual address. caches are disabled for accesses to these addresses, and main memory (or memory-mapped i/o device registers) is accessed directly.
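the unmapped translation for kseg0 and kseg1 is a plain subtraction of the segment base, which the following python sketch illustrates. the function name is hypothetical, and addresses are treated as unsigned 32-bit integers.

```python
def unmapped_to_physical(vaddr):
    """Translate a kseg0/kseg1 address; None if the address is elsewhere."""
    top3 = (vaddr >> 29) & 0b111     # bits 31 to 29 select the segment
    if top3 == 0b100:                # kseg0: cacheability set by Config K0
        return vaddr - 0x80000000, "kseg0"
    if top3 == 0b101:                # kseg1: always uncached
        return vaddr - 0xA0000000, "kseg1"
    return None                      # other segments use the TLB

print(unmapped_to_physical(0x80001000))
print(unmapped_to_physical(0xA0001000))
```

both calls map to the same physical address, 0x0000 1000, which is how the same memory can be reached cached through kseg0 and uncached through kseg1.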
(4) ksseg (32-bit kernel mode, supervisor space)
when the kx bit of the status register is 0 and the higher 3 bits of the virtual address are 110, the ksseg virtual address space is selected; it is the current 512 mb (2^29 bytes) virtual address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to ksseg are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.

(5) kseg3 (32-bit kernel mode, kernel space 3)
when the kx bit of the status register is 0 and the higher 3 bits of the virtual address are 111, the kseg3 virtual address space is selected; it is the current 512 mb (2^29 bytes) kernel virtual space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to kseg3 are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.

table 5-7. 64-bit kernel mode segments

address bit value               kx  segment name  virtual address                                 physical address                    size
a(63:62) = 00                   1   xkuseg        0x0000 0000 0000 0000 to 0x0000 00ff ffff ffff  tlb map                             1 tb (2^40 bytes)
a(63:62) = 01                   1   xksseg        0x4000 0000 0000 0000 to 0x4000 00ff ffff ffff  tlb map                             1 tb (2^40 bytes)
a(63:62) = 10                   1   xkphys        0x8000 0000 0000 0000 to 0xbfff ffff ffff ffff  0x0000 0000 0000 to 0x000f ffff ffff  2^36 bytes (see (8))
a(63:62) = 11                   1   xkseg         0xc000 0000 0000 0000 to 0xc000 00ff 7fff ffff  tlb map                             2^40 - 2^31 bytes
a(63:62) = 11, a(63:31) = -1    1   ckseg0        0xffff ffff 8000 0000 to 0xffff ffff 9fff ffff  0x0000 0000 to 0x1fff ffff          512 mb (2^29 bytes)
a(63:62) = 11, a(63:31) = -1    1   ckseg1        0xffff ffff a000 0000 to 0xffff ffff bfff ffff  0x0000 0000 to 0x1fff ffff          512 mb (2^29 bytes)
a(63:62) = 11, a(63:31) = -1    1   cksseg        0xffff ffff c000 0000 to 0xffff ffff dfff ffff  tlb map                             512 mb (2^29 bytes)
a(63:62) = 11, a(63:31) = -1    1   ckseg3        0xffff ffff e000 0000 to 0xffff ffff ffff ffff  tlb map                             512 mb (2^29 bytes)

(for all rows, the status register bit values are ksu = 00 or exl = 1 or erl = 1.)
(6) xkuseg (64-bit kernel mode, user space)
when the kx bit of the status register is 1 and bits 63 and 62 of the virtual address are 00, the xkuseg virtual address space is selected; it is the 1 tb (2^40 bytes) current user address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to xkuseg are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.
if the erl bit of the status register is 1, the user address space is assigned 2 gb (2^31 bytes) without tlb mapping and becomes unmapped (with virtual addresses being used as physical addresses) and uncached.

(7) xksseg (64-bit kernel mode, normal supervisor space)
when the kx bit of the status register is 1 and bits 63 and 62 of the virtual address are 01, the xksseg address space is selected; it is the 1 tb (2^40 bytes) normal supervisor address space. the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address.
references to xksseg are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.

(8) xkphys (64-bit kernel mode, physical spaces)
when the kx bit of the status register is 1 and bits 63 and 62 of the virtual address are 10, the virtual address space is called xkphys and one of the 8 spaces of the unmapped area is selected. internally, bits 35 to 0 of the virtual address are used for the physical address as is. if any of bits 58 to 36 of the address is 1, an attempt to access that address results in an address error.
bits 61 to 59 of the virtual address indicate the cache usability of each space and its attribute (algorithm). table 5-8 shows the cache algorithm corresponding to each of the 8 address spaces.

table 5-8. cache algorithm and xkphys address space

bits 61 to 59  cache usability and algorithm              address
0              reserved                                   0x8000 0000 0000 0000 to 0x8000 000f ffff ffff
1              cacheable, write-through, write-allocated  0x8800 0000 0000 0000 to 0x8800 000f ffff ffff
2              uncached                                   0x9000 0000 0000 0000 to 0x9000 000f ffff ffff
3              cacheable, writeback                       0x9800 0000 0000 0000 to 0x9800 000f ffff ffff
4              cacheable, write-through, write-allocated  0xa000 0000 0000 0000 to 0xa000 000f ffff ffff
5              cacheable, writeback                       0xa800 0000 0000 0000 to 0xa800 000f ffff ffff
6              reserved                                   0xb000 0000 0000 0000 to 0xb000 000f ffff ffff
7              uncached, accelerated                      0xb800 0000 0000 0000 to 0xb800 000f ffff ffff
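decoding an xkphys address amounts to three bit-field extractions, sketched below in python. the function and table names are hypothetical. the address error check uses bits 58 to 36, which is what the 36-bit physical address and the 64 gb address ranges in table 5-8 imply; treat that range as an assumption of this sketch.

```python
# Cache attribute selected by bits 61 to 59 (from table 5-8).
XKPHYS_ATTR = {
    0: "reserved",
    1: "cacheable, write-through, write-allocated",
    2: "uncached",
    3: "cacheable, writeback",
    4: "cacheable, write-through, write-allocated",
    5: "cacheable, writeback",
    6: "reserved",
    7: "uncached, accelerated",
}

def decode_xkphys(vaddr):
    """Return (cache attribute, physical address); None means address error."""
    if (vaddr >> 62) & 0b11 != 0b10:
        raise ValueError("not an xkphys address")
    if (vaddr >> 36) & ((1 << 23) - 1):  # bits 58 to 36 must all be 0
        return None
    attr = (vaddr >> 59) & 0b111         # bits 61 to 59
    paddr = vaddr & ((1 << 36) - 1)      # bits 35 to 0 are used as is
    return XKPHYS_ATTR[attr], paddr

print(decode_xkphys(0x9000000012345678))
```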
(9) xkseg (64-bit kernel mode, physical spaces)
when the kx bit of the status register is 1 and bits 63 and 62 of the virtual address are 11, the virtual address space is called xkseg and is selected as either of the following.

- kernel virtual space xkseg, the current kernel virtual space; the virtual address is extended with the contents of the 8-bit asid field to form a unique virtual address. references to xkseg are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.
- one of the four 32-bit kernel compatibility spaces, as described in the next section.

(10) 64-bit kernel mode compatible spaces (ckseg0, ckseg1, cksseg, and ckseg3)
if the conditions listed below are satisfied in kernel mode, ckseg0, ckseg1, cksseg, or ckseg3 (each having 512 mb) is selected as a compatible space according to the state of bits 30 and 29 of the address.

- the kx bit of the status register is 1.
- bits 63 and 62 of the 64-bit virtual address are 11.
- bits 61 to 31 of the virtual address are all 1.

(a) ckseg0
this space is an unmapped area, compatible with the 32-bit mode kseg0 space. the k0 field of the config register controls cacheability and coherency (refer to 5.5.8 config register (16)).

(b) ckseg1
this space is an unmapped and uncached area, compatible with the 32-bit mode kseg1 space.

(c) cksseg
this space is the ordinary supervisor virtual space, compatible with the 32-bit mode ksseg space. references to cksseg are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.

(d) ckseg3
this space is the kernel virtual space, compatible with the 32-bit mode kseg3 space. references to ckseg3 are mapped through the tlb. whether the cache can be used or not is determined by the c bit of each page's tlb entry.
5.5 memory management registers

the cp0 registers used for managing the memory are described below. the memory management registers are listed in table 5-9. each register has a unique identification number that is referred to as its register number. cp0 registers not listed below are used for exception processing (refer to chapter 6 exception processing for details).

table 5-9. cp0 memory management registers

register name                register no.
index register               0
random register              1
entrylo0 register            2
entrylo1 register            3
pagemask register            5
wired register               6
entryhi register             10
prid register                15
config register              16
lladdr register (see note)   17
taglo register               28
taghi register               29

note: this register is defined to preserve compatibility with other vr series products and has no actual operation.

with the vr5500, the hardware automatically avoids a hazard that occurs when a tlb or cp0 register is changed, except when settings related to instruction fetch are made. for the hazards related to instruction fetch, refer to chapter 19 instruction hazards.
5.5.1 Index register (0)

The Index register is a 32-bit, readable/writable register whose six lower bits index an entry in the TLB. The most significant bit of the register shows the success or failure of a TLB probe (TLBP) instruction.

The Index field also specifies the TLB entry affected by the TLB read (TLBR) or TLB write index (TLBWI) instruction. If the TLBP instruction has been successful, the index of the TLB entry that matches the contents of the EntryHi register is set to the Index field.

Since the contents of the Index register after reset are undefined, initialize this register via software.

Figure 5-11. Index register

Bit 31         P
Bits 30 to 6   0
Bits 5 to 0    Index

P: Indicates whether probing is successful or not. It is set (1) if the latest TLBP instruction fails. It is cleared (0) when the TLBP instruction is successful.
Index: Specifies an index to a TLB entry that is a target of the TLBR or TLBWI instruction.
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.

5.5.2 Random register (1)

The Random register is a read-only register. The lower 6 bits are used in referencing a TLB entry. This register is decremented each time an instruction is executed. The values that the register can take are bounded as follows.

• The lower bound is the content of the Wired register.
• The upper bound is 47.

The Random register specifies the entry in the TLB that is affected by the TLB write random (TLBWR) instruction. The register can be read to verify proper operation of the processor.

The Random register is set to the value of the upper bound upon cold reset. This register is also set to the upper bound when the Wired register is written.

Figure 5-12. Random register

Bits 31 to 6   0
Bits 5 to 0    Random

Random: TLB random index
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.
5.5.3 EntryLo0 (2) and EntryLo1 (3) registers

The EntryLo register consists of two registers that have identical formats: the EntryLo0 register, used for even pages, and the EntryLo1 register, used for odd pages. The EntryLo0 and EntryLo1 registers are both read-/write-accessible. They are used to access the lower bits of the on-chip TLB. When a TLB read/write operation is carried out, the EntryLo0 and EntryLo1 registers access the contents of the lower bits of TLB entries at even and odd addresses, respectively.

Since the contents of these registers after reset are undefined, initialize these registers via software.

Figure 5-13. EntryLo0 and EntryLo1 registers

The EntryLo0 and EntryLo1 registers have the same layout.

                 32-bit mode     64-bit mode
0                Bits 31 to 30   Bits 63 to 30
PFN              Bits 29 to 6    Bits 29 to 6
C                Bits 5 to 3     Bits 5 to 3
D                Bit 2           Bit 2
V                Bit 1           Bit 1
G                Bit 0           Bit 0

PFN: Page frame number; higher bits of the physical address.
C: Specifies the page attribute of the TLB entry (refer to Table 5-10).
D: Dirty. If this bit is set to 1, the page is writable. This bit is actually a write-protect bit that software can use to prevent alteration of data.
V: Valid. If this bit is set to 1, it indicates that the TLB entry is valid; if an entry with this bit cleared to 0 is hit, a TLB invalid exception (TLBL or TLBS) occurs.
G: Global. If this bit is set in both the EntryLo0 and EntryLo1 registers, the processor ignores the ASID during TLB lookup.
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.

Caution: If the system interface of the VR5500 is in the 32-bit bus mode, an address error exception does not occur and physical addresses are processed with bits 35 to 32 ignored, even if the space is referenced so that bits 35 to 32 of the physical address are a value other than 0.
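As an illustration of the field layout above, the following minimal sketch packs an EntryLo value from its fields. The function name is hypothetical, and the sketch assumes that the 24-bit PFN field corresponds to physical address bits 35 to 12; it only builds the value and does not access CP0.

```c
#include <stdint.h>

/* Hypothetical helper: pack PFN, C, D, V, and G into the EntryLo layout
 * (PFN in bits 29 to 6, C in bits 5 to 3, D bit 2, V bit 1, G bit 0). */
uint32_t entrylo_make(uint32_t pfn, unsigned c, int d, int v, int g)
{
    return ((pfn & 0xFFFFFFu) << 6)   /* 24-bit page frame number */
         | ((c & 0x7u) << 3)          /* page attribute, see Table 5-10 */
         | ((d ? 1u : 0u) << 2)       /* dirty (write enable) */
         | ((v ? 1u : 0u) << 1)       /* valid */
         |  (g ? 1u : 0u);            /* global */
}
```

For example, a valid, writable, non-global page with C = 3 (cacheable, writeback) and PFN 0x12345 yields 0x0048D15E.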
The C field specifies whether the cache is used when a page is referenced. To use the cache, select either the "writeback" or the "write-through, write-allocated" algorithm. Table 5-10 shows the page attributes selected by the C field.

Table 5-10. Cache algorithm

Value of C   Cache algorithm
0            Reserved
1            Cacheable, write-through, write-allocated
2            Uncached
3            Cacheable, writeback
4            Cacheable, write-through, write-allocated, unguarded
5            Cacheable, writeback, unguarded
6            Reserved
7            Uncached, accelerated

"Unguarded" means that, if a speculatively issued load/store instruction causes a data cache miss, a speculative refill operation to the external memory is enabled before the instruction is committed. Therefore, the unguarded attribute is valid only for the data cache.
5.5.4 PageMask register (5)

The PageMask register is a readable/writable register used for reading from or writing to the TLB; it holds a comparison mask that sets the page size for each TLB entry, as shown in Table 5-11. Page sizes can be set from 4 KB to 1 GB in ten ways.

TLB read/write operations use this register as either a source or a destination; the bits among bits 30 to 13 that are set in the mask are excluded from comparison during address translation.

Since the contents of the PageMask register after reset are undefined, initialize this register via software.

Table 5-11 lists the mask pattern for each page size. If the mask pattern is one not listed below, the TLB operates unexpectedly.

Figure 5-14. PageMask register

Bit 31          0
Bits 30 to 13   MASK
Bits 12 to 0    0

MASK: Page comparison mask, which determines the virtual page size for the corresponding entry.
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.

Table 5-11. Mask values and page sizes

Page size   MASK (bit 30 ... bit 13)
4 KB        000000000000000000
16 KB       000000000000000011
64 KB       000000000000001111
256 KB      000000000000111111
1 MB        000000000011111111
4 MB        000000001111111111
16 MB       000000111111111111
64 MB       000011111111111111
256 MB      001111111111111111
1 GB        111111111111111111
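The mask patterns in Table 5-11 follow a regular rule: each step multiplies the page size by four and adds two low-order 1 bits to the MASK field. A minimal sketch of that rule is shown below; the function name and the use of UINT32_MAX as an "unsupported size" marker are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical helper: return the PageMask register value for a page size
 * in bytes, or UINT32_MAX if the size is not one listed in Table 5-11. */
uint32_t pagemask_for_size(uint64_t page_bytes)
{
    uint64_t size = 4096;                       /* smallest page: 4 KB */

    for (int n = 0; n < 10; n++, size <<= 2) {  /* 4 KB * 4^n, n = 0..9 */
        if (size == page_bytes) {
            uint32_t mask_bits = (1u << (2 * n)) - 1u; /* 2n low-order ones */
            return mask_bits << 13;             /* MASK starts at bit 13 */
        }
    }
    return UINT32_MAX;                          /* not a supported size */
}
```

With this rule, a 16 KB page gives 0x0000 6000 and a 1 GB page gives 0x7FFF E000, which match the first and last non-zero rows of the table.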
5.5.5 Wired register (6)

The Wired register is a readable/writable register that specifies the lower boundary of the random entries of the TLB. Wired entries cannot be overwritten by a TLBWR instruction. They can, however, be overwritten by a TLBWI instruction. Random entries can be overwritten by both instructions.

Figure 5-15. Positions indicated by Wired register

TLB entries from the value specified by the Wired register up to entry 47:   range of random entries
TLB entries from entry 0 up to the value specified by the Wired register:    range of wired entries

The Wired register is cleared to 0 after reset. Writing this register also sets the Random register to the value of its upper boundary (see 5.5.2 Random register (1)).

Figure 5-16. Wired register

Bits 31 to 6   0
Bits 5 to 0    Wired

Wired: Specifies the TLB wired boundary.
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.
5.5.6 EntryHi register (10)

The EntryHi register is a readable/writable register that is used to access the higher bits of the TLB. The EntryHi register holds the higher bits of a TLB entry for TLB read/write operations. If a TLB refill, TLB invalid, or TLB modified exception occurs, the EntryHi register is set with the virtual page number (VPN2) and the ASID for the virtual address where the exception occurred. See Chapter 6 Exception Processing for details of TLB exceptions.

The ASID is used to read from or write to the ASID field of the TLB entry. During address translation, it is also compared with the ASID of the TLB entry as the ASID of the virtual address.

The EntryHi register is accessed by the TLBP, TLBWR, TLBWI, and TLBR instructions.

Figure 5-17. EntryHi register

32-bit mode
Bits 31 to 13   VPN2
Bits 12 to 8    0
Bits 7 to 0     ASID

64-bit mode
Bits 63 to 62   R
Bits 61 to 40   Fill
Bits 39 to 13   VPN2
Bits 12 to 8    0
Bits 7 to 0     ASID

VPN2: Virtual page number divided by two (mapping to two pages).
ASID: 8-bit address space ID field. This field enables the TLB to be shared by several processes. The virtual address of each process may be duplicated.
R: Space type (00 user, 01 supervisor, 11 kernel). Matches bits 63 and 62 of the virtual address.
Fill: Reserved. Ignored on write. Zero is returned when these bits are read.
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.
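The 64-bit mode layout can be illustrated with a small sketch that forms an EntryHi value from a virtual address and an ASID, for example before a TLBP probe. The function name is hypothetical; the sketch only computes the value and leaves the Fill and reserved bits at 0.

```c
#include <stdint.h>

/* Hypothetical helper: build a 64-bit mode EntryHi value.
 * R and VPN2 occupy the same bit positions as in the virtual address,
 * so they can be taken over with masks. */
uint64_t entryhi64_make(uint64_t vaddr, unsigned asid)
{
    uint64_t r    = vaddr & (0x3ull << 62);        /* R: bits 63 to 62 */
    uint64_t vpn2 = vaddr & 0x000000FFFFFFE000ull; /* VPN2: bits 39 to 13 */

    return r | vpn2 | (uint64_t)(asid & 0xFFu);    /* ASID: bits 7 to 0 */
}
```

For the kernel address 0xC000 0000 1234 5678 and ASID 0x42, the result is 0xC000 0000 1234 4042: the offset within the even-odd page pair (bits 12 to 0) is dropped.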
5.5.7 PRId (processor revision ID) register (15)

The 32-bit, read-only processor revision ID (PRId) register contains information identifying the implementation and revision level of the CPU and CP0.

Figure 5-18. PRId register

Bits 31 to 16   0
Bits 15 to 8    Imp
Bits 7 to 0     Rev

Imp: CPU processor ID number (0x55 for the VR5500)
Rev: CPU processor revision number
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.

The processor revision number is stored as a value in the form y.x, where y is a major revision number in bits 7 to 4 and x is a minor revision number in bits 3 to 0.

The processor revision number can distinguish some revisions of the chip. However, there is no guarantee that changes to the chip will necessarily be reflected in the PRId register, or that changes to the revision number necessarily reflect real chip changes. Therefore, create programs that do not depend on the processor revision number field.

5.5.8 Config register (16)

The Config register indicates and sets various states of the VR5500 processor.

Bits 31 to 28 and 21 to 3 are set by hardware after reset. These are read-only bits, and software can read them to check their status. Bits 27 to 22 and 2 to 0 are readable/writable and can be manipulated by software. Since these bits are undefined after reset, initialize these bits via software.
Figure 5-19. Config register (1/2)

[Bit-field diagram. Fields from bit 31 down: 0 (bit 31), EC (bits 30 to 28), EP (bits 27 to 24), EM (bits 23 to 22), 11 (bits 21 to 20), EW (bits 19 to 18), six single-bit fields in bits 17 to 12 that include BE, 011 (bits 11 to 9), 011 (bits 8 to 6), 1 (bit 5), 1 (bit 4), 0 (bit 3), K0 (bits 2 to 0).]

EC: Sets the division ratio of the system clock to PClock.

000   Divided by 2
001   Divided by 2.5
010   Divided by 3
011   Divided by 3.5
100   Divided by 4
101   Divided by 4.5
110   Divided by 5
111   Divided by 5.5

EP: Sets the transfer rate of block write data. The number of data words differs depending on the bus mode of the system interface (the transfer pattern is the same).

• 32-bit bus mode

0000    DDDDDDDD                           (1 word/1 cycle)
0001    DDxDDxDDxDDx                       (2 words/3 cycles)
0010    DDxxDDxxDDxxDDxx                   (2 words/4 cycles)
0011    DxDxDxDxDxDxDxDx                   (2 words/4 cycles)
0100    DDxxxDDxxxDDxxxDDxxx               (2 words/5 cycles)
0101    DDxxxxDDxxxxDDxxxxDDxxxx           (2 words/6 cycles)
0110    DxxDxxDxxDxxDxxDxxDxxDxx           (2 words/6 cycles)
0111    DDxxxxxxDDxxxxxxDDxxxxxxDDxxxxxx   (2 words/8 cycles)
1000    DxxxDxxxDxxxDxxxDxxxDxxxDxxxDxxx   (2 words/8 cycles)
Other   Reserved

• 64-bit bus mode

0000    DDDD                (1 doubleword/1 cycle)
0001    DDxDDx              (2 doublewords/3 cycles)
0010    DDxxDDxx            (2 doublewords/4 cycles)
0011    DxDxDxDx            (2 doublewords/4 cycles)
0100    DDxxxDDxxx          (2 doublewords/5 cycles)
0101    DDxxxxDDxxxx        (2 doublewords/6 cycles)
0110    DxxDxxDxxDxx        (2 doublewords/6 cycles)
0111    DDxxxxxxDDxxxxxx    (2 doublewords/8 cycles)
1000    DxxxDxxxDxxxDxxx    (2 doublewords/8 cycles)
Other   Reserved
Figure 5-19. Config register (2/2)

EM: Sets the SysAD bus timing mode. The mode that can be selected differs depending on the bus mode of the system interface.

• In normal mode
00       VR4000 compatible mode
01       Reserved
10       Pipeline write mode
11       Write re-issuance mode

• In out-of-order return mode
00, 10   Pipeline mode
01, 11   Re-issuance mode

EW: Sets the SysAD bus mode (bus width).

00       64-bit bus mode
01       32-bit bus mode
Other    Reserved

BE: Sets the endian mode.

0        Little endian
1        Big endian

K0: Sets the cache algorithm of kseg0.

001      Cacheable, write-through, write-allocated
010      Uncached
011      Cacheable, writeback
100      Cacheable, write-through, write-allocated, unguarded
101      Cacheable, writeback, unguarded
111      Uncached, accelerated
Other    Reserved

1: 1 is returned when read.
0: 0 is returned when read.
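Two of the Config fields lend themselves to a short sketch: decoding the EC division ratio and updating the K0 field. The function names are hypothetical, and the ratio is returned multiplied by two so that the half-step ratios (2.5, 3.5, and so on) stay in integer arithmetic. The sketch only manipulates values; reading and writing CP0 register 16 is left out.

```c
#include <stdint.h>

/* Hypothetical helper: division ratio of the system clock to PClock,
 * multiplied by 2. EC = 000 means divide by 2, and each step adds 0.5. */
unsigned config_ec_ratio_x2(uint32_t config)
{
    unsigned ec = (config >> 28) & 0x7u;   /* EC: bits 30 to 28 */
    return ec + 4u;                        /* 4 means 2.0, 11 means 5.5 */
}

/* Hypothetical helper: return config with the K0 field (bits 2 to 0)
 * replaced, leaving every other bit unchanged. */
uint32_t config_set_k0(uint32_t config, unsigned k0)
{
    return (config & ~0x7u) | (k0 & 0x7u);
}
```

For example, EC = 011 gives 7, that is, a division ratio of 3.5, and config_set_k0(value, 3) selects the cacheable, writeback algorithm for kseg0.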
5.5.9 LLAddr (load linked address) register (17)

The LLAddr register is a read/write register that indicates the physical address that was read by the last LL instruction. This register is used only for diagnostic purposes.

The PAddr field indicates the physical address PA(35:4) that is read when the LL instruction is executed.

The contents of the LLAddr register after reset are undefined.

Figure 5-20. LLAddr register

Bits 31 to 0   PAddr

PAddr: Bits 35 to 4 of the physical address read by the last LL instruction
5.5.10 TagLo (28) and TagHi (29) registers

The TagLo and TagHi registers are 32-bit readable/writable registers that hold the cache tag during cache initialization, cache diagnostics, or cache error processing. The tag registers are written by the CACHE and MTC0 instructions.

The contents of these registers after reset are undefined.

Figure 5-21. TagLo and TagHi registers

[Bit-field diagram. TagLo: PTagLo in bits 31 to 8, PState in bits 7 to 6, and the L, R, and P bits together with reserved (0) bits in bits 5 to 0. TagHi: bits 31 to 0 are reserved (0).]

PTagLo: Specifies physical address bits 31 to 10.
PState: Indicates the status of the cache.
  00      Invalid
  10      Clean
  11      Dirty
  Other   Reserved
L: Sets the cache line lock.
  0       Not locked
  1       Locked
R: Specifies the way of the cache that is a candidate for replacement. The candidate for replacement is determined by the LRU algorithm.
  0       Way 0
  1       Way 1
P: Even parity bit for the cache tag
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.

The Index_Store_Tag operation of the CACHE instruction writes the value of the P bit of the TagLo register to the P bit of the cache tag as is (parity is not calculated). An operation other than Index_Store_Tag that changes the contents of the cache writes the parity value calculated by the processor to the P bit of the cache tag.

The Index_Load_Tag operation of the CACHE instruction writes the value of the P bit of the target cache tag to the P bit of the TagLo register.
Chapter 6 Exception Processing

This chapter describes CPU exception processing, including an explanation of the hardware that processes exceptions. For details of FPU exceptions, see Chapter 8 Floating-Point Exceptions.

6.1 Exception processing operation

The processor receives exceptions from a number of sources, including translation lookaside buffer (TLB) misses, arithmetic overflows, I/O interrupts, and system calls. When the CPU detects an exception, the normal sequence of instruction execution is suspended and the processor enters kernel mode (refer to Chapter 5 Memory Management System for a description of system operating modes). The processor then disables interrupts and transfers control to the exception handler (a software routine located at a fixed address). The exception handler must save the state of the processor, including the contents of the program counter, the current operating mode (user or supervisor), statuses, and the interrupt enable state, so that they can be restored when the exception has been processed.

When an exception occurs, the CPU loads the exception program counter (EPC) register with an address where execution can restart after the exception has been processed. The restart address in the EPC register is the address of the instruction that caused the exception or, if the instruction was being executed in a branch delay slot, the address of the branch instruction preceding the delay slot.

In addition, registers that hold address, cause, and status information during exception processing are also available. For details, refer to 6.2 Exception processing registers. For details of exception processing, refer to 6.4 Details of exceptions.
6.2 Exception processing registers

This section explains the CP0 registers that are used in exception processing. Table 6-1 lists these registers along with their numbers; each register has a unique identification number that is referred to as its register number. The CP0 registers not listed in the table are used in memory management (for details, see Chapter 5 Memory Management System).

The exception handler examines the CP0 registers during exception processing to determine the cause of the exception and the state of the CPU at the time the exception occurred.

Table 6-1. CP0 exception processing registers

Register name                  Register no.
Context register               4
BadVAddr register              8
Count register                 9
Compare register               11
Status register                12
Cause register                 13
EPC register                   14
WatchLo register               18
WatchHi register               19
XContext register              20
Performance Counter register   25
Parity Error register          26
Cache Error register           27
ErrorEPC register              30

With the VR5500, the hardware automatically avoids a hazard that occurs when a TLB or CP0 register is changed, except when settings related to instruction fetch are made. For the hazards related to instruction fetch, refer to Chapter 19 Instruction Hazards.
6.2.1 Context register (4)

The Context register is a read-/write-accessible register that indicates an entry in the page table entry (PTE) array in memory. This array is an operating system data structure that stores virtual-to-physical address translations. When a TLB miss occurs, the operating system loads the entry whose translation failed from the PTE array into the TLB. The Context register is used by the TLB refill exception handler for loading TLB entries. The Context register duplicates some of the information provided in the BadVAddr register, but the information is arranged in a form that is more useful for a TLB exception handler.

The contents of the Context register after reset are undefined.

Figure 6-1. Context register

            32-bit mode     64-bit mode
PTEBase     Bits 31 to 23   Bits 63 to 23
BadVPN2     Bits 22 to 4    Bits 22 to 4
0           Bits 3 to 0     Bits 3 to 0

PTEBase: Base address of the page table entry.
BadVPN2: This field holds the value obtained by halving the virtual page number of the most recent virtual address for which translation failed.
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.

The PTEBase field is used only by the operating system as the pointer to the current PTE array in memory.

The 19-bit BadVPN2 field contains bits 31 to 13 of the virtual address that caused the TLB miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair. For a 4 KB page size, this format can directly address the pair-table of 8-byte PTEs. When the page size is 16 KB or more, shifting or masking this value produces the correct PTE reference address.
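The way the 32-bit mode Context value is formed from PTEBase and the failing address can be sketched as follows. The function name is hypothetical and the code models the register contents only. With a 4 KB page size and 8-byte PTEs, each even-odd pair occupies 16 bytes, which is why BadVPN2 is placed at bit 4.

```c
#include <stdint.h>

/* Hypothetical model of the 32-bit mode Context register contents after
 * a TLB miss at bad_vaddr, with PTEBase previously set by the OS. */
uint32_t context_value(uint32_t ptebase, uint32_t bad_vaddr)
{
    uint32_t badvpn2 = (bad_vaddr >> 13) & 0x7FFFFu; /* VA bits 31 to 13 */

    return (ptebase & 0xFF800000u)   /* PTEBase: bits 31 to 23 */
         | (badvpn2 << 4);           /* BadVPN2: bits 22 to 4 */
}
```

For a miss at 0x0040 2ABC with PTEBase 0x8080 0000, the value is 0x8080 2010, which a handler can use directly as the address of the 16-byte PTE pair.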
6.2.2 BadVAddr register (8)

The bad virtual address (BadVAddr) register is a read-only register that saves the most recent virtual address that failed to have a valid translation, or that had an addressing error. Figure 6-2 shows the format of the BadVAddr register.

If an address error occurs as a result of an instruction fetch in the 64-bit mode and a virtual address is stored in the BadVAddr register, bits 58 to 40 are either all 0 or all 1.

The contents of the BadVAddr register after reset are undefined.

Caution: This register saves no information after a bus error exception, because it is not an address error exception.

Figure 6-2. BadVAddr register

32-bit mode   Bits 31 to 0   BadVAddr
64-bit mode   Bits 63 to 0   BadVAddr

BadVAddr: Most recent virtual address for which an addressing error occurred, or for which address translation failed.
6.2.3 Count register (9)

The readable/writable Count register acts as a timer. It is incremented in synchronization with the frequency of 1/2 PClock, regardless of instruction execution or pipeline progress status.

This register is a free-running type. When the register reaches all 1s, it rolls over to 0 at the next event and continues incrementing. This register is used for self-diagnostic tests, system initialization, or the establishment of inter-process synchronization.

The contents of the Count register after reset are undefined.

Figure 6-3. Count register

Bits 31 to 0   Count

Count: Most recent count value.

6.2.4 Compare register (11)

The Compare register causes a timer interrupt; it holds a value but does not change on its own.

When the value of the Count register (see 6.2.3 Count register (9)) equals the value of the Compare register, the IP7 bit in the Cause register is set. When the IP7 bit is set, an interrupt occurs as soon as the interrupt is enabled.

Writing a value to the Compare register, as a side effect, clears the timer interrupt request.

For diagnostic purposes, the Compare register is a read/write register. Normally, this register should be used only for writing.

The contents of the Compare register after reset are undefined.

Figure 6-4. Compare register format

Bits 31 to 0   Compare

Compare: Value that is compared with the count value of the Count register.
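Because the Count register advances at half the PClock frequency and wraps from all 1s to 0, the Compare value for a given delay can be computed with ordinary modulo-2^32 arithmetic. The following is a minimal sketch; the function name and the 400 MHz PClock used in the example are assumptions, not values from this manual.

```c
#include <stdint.h>

/* Hypothetical helper: value to write to Compare so that the timer
 * interrupt request is raised delay_us microseconds after count_now. */
uint32_t compare_for_delay(uint32_t count_now, uint32_t pclock_hz,
                           uint32_t delay_us)
{
    /* Count increments at PClock/2; use 64 bits to avoid overflow. */
    uint64_t ticks = ((uint64_t)pclock_hz / 2u) * delay_us / 1000000u;

    /* Unsigned addition wraps modulo 2^32, like the Count register. */
    return count_now + (uint32_t)ticks;
}
```

With an assumed PClock of 400 MHz, a 1 ms delay corresponds to 200,000 counts. If Count is close to its maximum, the result wraps, which matches the roll-over behavior of the free-running counter.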
6.2.5 Status register (12)

The Status register is a readable/writable register that contains the operating mode, the interrupt enable state, and the diagnostic states of the processor.

Figure 6-5. Status register

Bit 31          XX
Bits 30 to 28   CU(2:0)
Bit 27          0
Bit 26          FR
Bit 25          0
Bits 24 to 16   DS
Bits 15 to 8    IM(7:0)
Bit 7           KX
Bit 6           SX
Bit 5           UX
Bits 4 to 3     KSU
Bit 2           ERL
Bit 1           EXL
Bit 0           IE

XX: Enables use of the MIPS IV instruction set in the user mode (0 disables use, 1 enables use).
CU: Enables use of three coprocessors (0 disables use, 1 enables use). In the kernel mode, CP0 can always be used regardless of the CU0 bit. CP2 is reserved for future expansion.
FR: Number of floating-point registers usable (0: 16, 1: 32).
DS: Self-diagnosis status field (see Figure 6-6).
IM: Interrupt mask. Enables external, internal, coprocessor, and software interrupts (0 disables, 1 enables). This field consists of 8 bits and controls eight interrupts. Each interrupt is allocated to the corresponding bit of this field as follows.
  IM7: Masks timer interrupts or INT5# and external write requests.
  IM(6:2): Masks ordinary external interrupts (INT(4:0)# and external write request).
  IM(1:0): Masks software interrupts.
KX: Enables 64-bit addressing in kernel mode (0: 32-bit, 1: 64-bit). If this bit is set, an XTLB refill exception occurs if a TLB miss occurs in the kernel mode address space. In addition, 64-bit operations are always valid in kernel mode.
SX: Enables 64-bit addressing and operation in supervisor mode (0: 32-bit, 1: 64-bit). If this bit is set, an XTLB refill exception occurs if a TLB miss occurs in the supervisor mode address space.
UX: Enables 64-bit addressing and operation in user mode (0: 32-bit, 1: 64-bit). If this bit is set, an XTLB refill exception occurs if a TLB miss occurs in the user mode address space.
KSU: Sets and indicates the operating mode (10 user, 01 supervisor, 00 kernel).
ERL: Sets and indicates the error level (0 normal, 1 error).
EXL: Sets and indicates the exception level (0 normal, 1 exception).
IE: Sets and indicates interrupt enabling/disabling (0 disabled, 1 enabled).
0: RFU. Write 0 to this bit. Zero is returned when this bit is read.
Figure 6-6 shows the details of the diagnostic status (DS) field.

Figure 6-6. Status register diagnostic status field

Bit 24   DME
Bit 23   0
Bit 22   BEV
Bit 21   TS
Bit 20   SR
Bit 19   0
Bit 18   CH
Bit 17   CE
Bit 16   DE

DME: Enables setting of debug mode (0 disables, 1 enables).
BEV: Specifies the base address of the TLB refill exception vector and general-purpose exception vector (0 normal, 1 bootstrap).
TS: Occurrence of TLB shutdown (0 does not occur, 1 occurs). This bit is used to avoid an adverse effect if two or more TLB entries match the same virtual address. When this bit is set (1), a TLB refill exception occurs. TLB shutdown also occurs if the TLB entry that matches a virtual address is invalidated (by clearing the V bit of the entry).
SR: Occurrence of soft reset or NMI (0 does not occur, 1 occurs).
CH: Condition bit of CP0 (0 false, 1 true). This bit can be read or written only by software and is not affected by hardware.
CE: When this bit is 1, the contents of the Parity Error register are used to set or change the check bit of the cache (see the description of the Parity Error register).
DE: Enables exception occurrence in case of a cache parity error (0 enables, 1 disables).
0: Reserved. Write 0 to this bit. Zero is returned if this bit is read.

The fields of the Status register that set the mode and access status are explained next.

(1) Interrupt enable

Interrupts are enabled when all of the following conditions are true:

• IE is set to 1.
• EXL is cleared to 0.
• ERL is cleared to 0.
• The appropriate bit of IM is set to 1.
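The four interrupt enable conditions can be written as a single predicate over the Status and Cause register values. This is a minimal sketch with a hypothetical function name; it relies on IM(7:0) of the Status register and IP(7:0) of the Cause register both occupying bits 15 to 8.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if at least one pending interrupt
 * request would be accepted with the given Status and Cause values. */
int interrupt_accepted(uint32_t status, uint32_t cause)
{
    int ie  = (int)(status & 0x1u);          /* IE:  bit 0 */
    int exl = (int)((status >> 1) & 0x1u);   /* EXL: bit 1 */
    int erl = (int)((status >> 2) & 0x1u);   /* ERL: bit 2 */

    /* A request counts only if its IM bit is also set to 1. */
    uint32_t pending = (cause >> 8) & (status >> 8) & 0xFFu;

    return ie && !exl && !erl && pending != 0;
}
```

For example, with IE and IM7 set and IP7 pending, the function returns 1; setting EXL, or masking IM7, makes it return 0.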
(2) Operating modes

The following Status register bit settings are required for user, kernel, and supervisor modes.

• The processor is in the user mode when the KSU field is 10, the EXL bit is 0, and the ERL bit is 0.
• The processor is in the supervisor mode when the KSU field is 01, the EXL bit is 0, and the ERL bit is 0.
• The processor is in the kernel mode when the KSU field is 00, the EXL bit is 1, or the ERL bit is 1.

Accessing the kernel address space is enabled only in the kernel mode. Accessing the supervisor address space is enabled in the supervisor mode and kernel mode. Accessing the user address space is enabled in all modes.

(3) Addressing mode

The following Status register bit settings select 32- or 64-bit operation for user, kernel, and supervisor operating modes. Enabling 64-bit operation permits the execution of 64-bit opcodes and translation of 64-bit addresses. 64-bit operation for user, kernel, and supervisor modes can be set independently.

• 64-bit addressing for the kernel mode is enabled when the KX bit is 1. 64-bit operations are always valid in the kernel mode. If a TLB miss occurs in the kernel mode address space when this bit is set, an XTLB refill exception occurs.
• 64-bit addressing and operations are enabled for the supervisor mode when the SX bit is 1. If a TLB miss occurs in the supervisor mode address space when this bit is set, an XTLB refill exception occurs.
• 64-bit addressing and operations are enabled for the user mode when the UX bit is 1. If a TLB miss occurs in the user mode address space when this bit is set, an XTLB refill exception occurs.

(4) Status at reset

At reset, the contents of the Status register are undefined except for the following bits.

• The SR bit is 0 when a cold reset is executed and is 1 when a soft reset is executed or an NMI occurs.
• The ERL bit is 1 and the BEV bit is 1.
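The operating mode rules in (2) can be sketched as a small decoder of the Status register value. The names are hypothetical, and the KSU value 11, which the text above does not define, is reported separately rather than guessed.

```c
#include <stdint.h>

/* Hypothetical names; the manual does not define a C interface. */
typedef enum {
    MODE_KERNEL,
    MODE_SUPERVISOR,
    MODE_USER,
    MODE_UNDEFINED      /* KSU = 11 is not described in this section */
} cpu_mode_t;

cpu_mode_t status_operating_mode(uint32_t status)
{
    /* EXL (bit 1) or ERL (bit 2) forces kernel mode regardless of KSU. */
    if (status & 0x6u)
        return MODE_KERNEL;

    switch ((status >> 3) & 0x3u) {   /* KSU: bits 4 to 3 */
    case 0:  return MODE_KERNEL;
    case 1:  return MODE_SUPERVISOR;
    case 2:  return MODE_USER;
    default: return MODE_UNDEFINED;
    }
}
```

Because ERL is 1 at reset, this decoder reports kernel mode for any reset-time Status value, whatever the undefined KSU bits happen to contain.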
6.2.6 Cause register (13)

The 32-bit readable/writable Cause register holds the cause of the most recent exception. The 5-bit exception code field indicates one of the exception causes (see Table 6-2). Other bits hold the detailed information of the specific exception. All bits in the Cause register, except the IP1 and IP0 bits, are read-only; IP1 and IP0 are used for software interrupts.

The contents of the Cause register after reset are undefined.

Figure 6-7. Cause register

Bit 31          BD
Bit 30          0
Bits 29 to 28   CE
Bits 27 to 16   0
Bits 15 to 8    IP(7:0)
Bit 7           0
Bits 6 to 2     ExcCode
Bits 1 to 0     0

BD: Indicates whether the most recent exception occurred in the branch delay slot (1 in delay slot, 0 normal).
CE: Indicates the coprocessor number in which a coprocessor unusable exception occurred. This field remains undefined for as long as no coprocessor unusable exception occurs.
IP: Indicates whether an interrupt is pending (1 interrupt pending, 0 no interrupt). Interrupt requests are assigned to the bits as follows.
  IP7: Timer interrupt request (or INT5# and external write request)
  IP(6:2): Normal interrupt requests (INT(4:0)# and external write request)
  IP(1:0): Software interrupt requests. These bits generate a software interrupt when they are set to 1 by software.
ExcCode: Exception code field (see Table 6-2 for details).
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.

Eight interrupt requests are provided in the VR5500, and the states of the requests are reflected in IP(7:0). For details of the interrupt function, refer to Chapter 16 Interrupts.

• IP7
This bit indicates a timer interrupt request, assertion of the interrupt request pin INT5#, or the occurrence of an interrupt due to an external write request. It is set when the contents of the Count register are equal to those of the Compare register, when the performance counter overflows, when the INT5# signal is asserted, or when data is written to an internal register by an external write request. Whether the timer interrupt request, the INT5# signal, or the interrupt request generated by the external write request is used is specified by the TIntSel signal at reset.

• IP(6:2)
Bits IP(6:2) reflect the logical sum of two internal registers. One of the registers latches the status of interrupt request pins INT(4:0)# in each cycle. Data is written to the other register by the external write request of the system interface.

• IP1, IP0
A software interrupt request can be set or cleared by manipulating bits IP1 and IP0.
The following table describes the exception codes.

Table 6-2. Exception codes

ExcCode    Mnemonic   Description
0          Int        Interrupt exception
1          Mod        TLB modified exception
2          TLBL       TLB refill exception (load or instruction fetch)
3          TLBS       TLB refill exception (store)
4          AdEL       Address error exception (load or instruction fetch)
5          AdES       Address error exception (store)
6          IBE        Bus error exception (instruction fetch)
7          DBE        Bus error exception (data load or store)
8          Sys        System call exception
9          Bp         Breakpoint exception
10         RI         Reserved instruction exception
11         CpU        Coprocessor unusable exception
12         Ov         Operation overflow exception
13         Tr         Trap exception
14         -          Reserved
15         FPE        Floating-point exception
16 to 22   -          Reserved
23         WATCH      Watch exception
24 to 31   -          Reserved

To indicate the cause of a floating-point exception in detail, the exception code included in the floating-point control/status register is used (refer to Chapter 8 Floating-Point Exceptions).
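An exception handler typically extracts ExcCode from bits 6 to 2 of the Cause register and dispatches on it. The lookup below is a minimal sketch based on Table 6-2; the function name is hypothetical, and the mnemonic spellings follow the usual MIPS capitalization, which is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: map the ExcCode field of a Cause register value
 * to its mnemonic from Table 6-2. Reserved codes return "reserved". */
const char *exccode_mnemonic(uint32_t cause)
{
    static const char *const names[32] = {
        [0]  = "Int",  [1]  = "Mod",  [2]  = "TLBL", [3]  = "TLBS",
        [4]  = "AdEL", [5]  = "AdES", [6]  = "IBE",  [7]  = "DBE",
        [8]  = "Sys",  [9]  = "Bp",   [10] = "RI",   [11] = "CpU",
        [12] = "Ov",   [13] = "Tr",   [15] = "FPE",  [23] = "WATCH"
    };
    /* ExcCode: bits 6 to 2; unlisted table entries are NULL. */
    const char *name = names[(cause >> 2) & 0x1Fu];

    return name ? name : "reserved";
}
```

The other Cause bits, such as BD in bit 31, do not affect the lookup, so the same function can be applied to the raw register value.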
6.2.7 EPC (exception program counter) register (14)

The EPC (exception program counter) register is a readable/writable register that contains the address at which processing resumes after an exception has been processed. The address is one of the following.

• Virtual address of the instruction that directly caused the exception.
• Virtual address of the preceding branch or jump instruction (when the instruction associated with the exception is in a branch delay slot, and the BD bit in the Cause register is set (1)).
• Virtual address of the instruction immediately after the WAIT instruction, when the standby mode is released by an interrupt exception immediately after execution of the WAIT instruction.

If an address error exception due to an instruction fetch occurs and a virtual address is stored in the EPC register in the 64-bit mode, bits 58 to 40 are all cleared to 0 or all set to 1.

The EXL bit in the Status register is set (1) to keep the processor from overwriting the address of the exception-causing instruction contained in the EPC register in the event of another exception.

The contents of the EPC register after reset are undefined.

Figure 6-8. EPC register

32-bit mode   Bits 31 to 0   EPC
64-bit mode   Bits 63 to 0   EPC

EPC: Address at which a program is restarted after exception processing.
6.2.8 WatchLo (18) and WatchHi (19) registers

The VR5500 can detect a request to reference the physical address specified by the WatchLo and WatchHi registers. This function can also be used as a debugging function to generate a watch exception at the execution of a load/store instruction.

Since the contents of these registers after reset are undefined, initialize these registers via software.

Figure 6-9. WatchLo and WatchHi registers

WatchLo
Bits 31 to 3   PAddr0
Bit 2          0
Bit 1          R
Bit 0          W

WatchHi
Bits 31 to 4   0
Bits 3 to 0    PAddr1

PAddr1: Bits 35 to 32 of the physical address.
PAddr0: Bits 31 to 3 of the physical address.
R: Enables an exception occurrence when a load instruction is executed (0 enables, 1 disables).
W: Enables an exception occurrence when a store instruction is executed (0 enables, 1 disables).
0: Reserved. Write 0 to these bits. Zero is returned when these bits are read.
6.2.9 xcontext register (20)

the readable/writable xcontext register indicates an entry in the page table entry (pte) array, an operating system data structure that stores virtual-to-physical address translations. if a tlb miss occurs, the operating system loads the missing translation from the pte array into the tlb. the xcontext register is used by the xtlb refill exception handler to load tlb entries in 64-bit addressing mode.

the xcontext register duplicates some of the information provided in the badvaddr register, and puts it in a form useful for the xtlb exception handler.

this register is included solely for operating system use. the operating system sets the ptebase field in this register, as needed.

the contents of the xcontext register after reset are undefined.

figure 6-10. xcontext register
bits 63 to 33 = ptebase, bits 32 to 31 = r, bits 30 to 4 = badvpn2, bits 3 to 0 = 0

ptebase: base address of the page table entry array.
r: address space type (00 user, 01 supervisor, 11 kernel). the setting of this field matches virtual address bits 63 and 62.
badvpn2: virtual address for which translation is invalid (bits 39 to 13).
0: reserved. write 0 to these bits. zero is returned when these bits are read.

only the operating system uses the ptebase field, as a pointer to the current pte array in memory. the r field is written by hardware in case of a tlb miss.

the 27-bit badvpn2 field holds bits 39 to 13 of the virtual address that caused the tlb refill; bit 12 is excluded because a single tlb entry maps to an even-odd page pair. for a 4 kb page size, this register format can be used directly as a pointer into a table of pairs of 8-byte ptes. when the page size is 16 kb or more, shifting or masking this value produces the appropriate pte reference address.
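as an illustration of the field positions in figure 6-10, the following c sketch composes and decodes an xcontext value. it is a software model only: on the real processor the r and badvpn2 fields are written by hardware, and the type and function names here are hypothetical.

```c
#include <stdint.h>

/* hypothetical container for the three xcontext fields */
typedef struct {
    uint64_t ptebase;  /* bits 63..33 */
    unsigned r;        /* bits 32..31 */
    uint32_t badvpn2;  /* bits 30..4  */
} xcontext_fields;

/* model of what the register would hold for a given ptebase and
 * faulting virtual address: r = vaddr bits 63..62,
 * badvpn2 = vaddr bits 39..13 (27 bits). */
static uint64_t xcontext_compose(uint64_t ptebase, uint64_t vaddr)
{
    uint64_t r = (vaddr >> 62) & 0x3u;
    uint64_t badvpn2 = (vaddr >> 13) & 0x7FFFFFFu;
    return (ptebase << 33) | (r << 31) | (badvpn2 << 4);
}

/* split a register value back into its fields */
static xcontext_fields xcontext_decode(uint64_t x)
{
    xcontext_fields f;
    f.ptebase = x >> 33;
    f.r = (unsigned)((x >> 31) & 0x3u);
    f.badvpn2 = (uint32_t)((x >> 4) & 0x7FFFFFFu);
    return f;
}
```

because badvpn2 sits at bit 4, its contribution to the register value is badvpn2 x 16, which is the byte offset of a pair of 8-byte ptes. this is why the value can serve as a table pointer for 4 kb pages.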
6.2.10 performance counter register (25)

the performance counter register consists of four registers: two counter registers and two control registers. each register is a 32-bit read/write register.

the vr5500 uses the performance counter register to count the number of events that have occurred in the processor, and can generate a timer interrupt request when the performance counter register overflows.

a counter register is incremented when an event specified by a control register occurs. the two counter registers correspond to the two control registers, and the counter registers operate independently of each other.

the control register specifies an event to count, the mode in which it is counted, and whether an interrupt request is enabled. when a counter register overflows, the ip7 bit of the cause register is set if the control register enables occurrence of an interrupt. even after the counter register overflows, it continues counting regardless of whether an interrupt request is reported.

when a cold reset is executed, the contents of all these registers are initialized to 0. the contents of these registers are retained after a warm reset.

figure 6-11. performance counter register
counter register: bits 31 to 0 = count
control register: bits 31 to 11 = 0, bit 10 = ce, bits 9 to 6 = event, bit 5 = ip, bit 4 = ie, bit 3 = u, bit 2 = s, bit 1 = k, bit 0 = exl

count: performance count value.
ce: enables performance count.
event: sets an event to count (refer to table 6-3).
ip: indicates occurrence of an interrupt. this bit is set (1) if the counter register overflows. writing 0 to this bit clears the interrupt request.
ie: enables occurrence of an interrupt. when this bit is set (1), the ip7 bit of the cause register is set (1) if the counter register overflows.
u: when this bit is set (1), counting is performed if an event occurs in the user mode.
s: when this bit is set (1), counting is performed if an event occurs in the supervisor mode.
k: when this bit is set (1), counting is performed if an event occurs in the kernel mode and if the erl and exl bits are 0.
exl: when this bit is set (1), counting is performed if an event occurs in the kernel mode and if the exl bit is 1.
0: reserved. write 0 to these bits. 0 is returned if these bits are read.
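the control register layout in figure 6-11 can be sketched in c as follows. the constant and function names are hypothetical; only the bit positions come from the figure, and the event numbers are those of table 6-3.

```c
#include <stdint.h>

/* hypothetical names for the control register bits of figure 6-11 */
enum {
    PERF_EXL = 1u << 0,   /* count in kernel mode during exception level */
    PERF_K   = 1u << 1,   /* count in kernel mode */
    PERF_S   = 1u << 2,   /* count in supervisor mode */
    PERF_U   = 1u << 3,   /* count in user mode */
    PERF_IE  = 1u << 4,   /* enable interrupt on overflow */
    PERF_IP  = 1u << 5,   /* interrupt pending, set on overflow */
    PERF_CE  = 1u << 10   /* enable counting */
};

/* build a control register value: the event number goes to bits 9..6,
 * and only the defined flag bits are kept so reserved bits stay 0. */
static uint32_t perf_control(unsigned event, uint32_t flags)
{
    return ((event & 0xFu) << 6) | (flags & 0x43Fu);
}
```

for example, `perf_control(8, PERF_CE | PERF_IE | PERF_U)` would select event 8 (data cache miss) counted in user mode with the overflow interrupt enabled.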
table 6-3 shows the setting of the event field.

table 6-3. events to count

event field   event
0             processor clock cycle
1             instruction execution
2             execution of load/prefetch/cache instruction
3             execution of store instruction
4             execution of branch instruction
5             execution of floating-point instruction
6             doubleword flush to main memory
7             tlb refill
8             data cache miss
9             instruction cache miss
10            branch prediction miss
11 to 15      reserved

remark if execution of an instruction is set as an event, the instruction is regarded as executed, and is counted as an event, even when it causes an exception.
6.2.11 parity error register (26)

the parity error register reads/writes the data parity bit of the cache for initializing the cache, self-diagnosis, and error processing.

the parity is read to the parity error register by the cache instruction index_load_tag.

if the ce bit of the status register is set, the contents of the parity error register are written instead of the parity to the data cache by a store instruction and to the instruction cache by the fill operation of the cache instruction.

the contents of the parity error register are undefined at reset.

figure 6-12. parity error register
bits 31 to 8 = 0, bits 7 to 0 = parity

parity: parity bit of cache data.

• for data cache
bit 0: even parity for the least significant byte
bit 1: even parity for the second least significant byte
bit 2: even parity for the third least significant byte
bit 3: even parity for the fourth least significant byte
bit 4: even parity for the fourth most significant byte
bit 5: even parity for the third most significant byte
bit 6: even parity for the second most significant byte
bit 7: even parity for the most significant byte

• for instruction cache
bit 0: even parity for the lower word
bit 1: even parity for the higher word

0: reserved. write 0 to these bits. zero is returned when these bits are read.
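the per-byte parity assignment for the data cache can be illustrated with a short c sketch. it assumes the usual meaning of even parity, which the manual does not spell out: each parity bit is chosen so that the byte plus its parity bit contains an even number of 1s, making the parity bit the xor of the eight data bits. the function name is hypothetical.

```c
#include <stdint.h>

/* hypothetical helper: computes the 8-bit parity field for one
 * doubleword of data cache contents. bit n of the result is the
 * even-parity bit of byte n, with byte 0 the least significant byte.
 * assumption: parity bit = xor of the data bits of that byte. */
static uint8_t dcache_parity(uint64_t data)
{
    uint8_t parity = 0;
    for (int byte = 0; byte < 8; byte++) {
        uint8_t b = (uint8_t)(data >> (8 * byte));
        b ^= (uint8_t)(b >> 4);   /* fold 8 bits down to 4 */
        b ^= (uint8_t)(b >> 2);   /* then to 2 */
        b ^= (uint8_t)(b >> 1);   /* then to 1: bit 0 is the xor of all bits */
        parity |= (uint8_t)((b & 1u) << byte);
    }
    return parity;
}
```

a self-diagnosis routine could compare such a computed value with the parity read back through index_load_tag, or write a deliberately wrong value to provoke a parity error.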
6.2.12 cache error register (27)

the cache error register is a 32-bit read-only register and indicates the status of a parity error in the cache. the parity error cannot be corrected.

the cache error register has cache index bits that indicate the cause of an error, and status bits.

the contents of the cache error register after reset are undefined.

figure 6-13. cache error register
bit 31 = er, bit 30 = ec, bit 29 = ed, bit 28 = et, bit 27 = es, bit 26 = ee, bit 25 = eb, bits 24 to 0 = 0

er: type of cache (0 instruction, 1 data).
ec: cache level of error (0 internal, 1 reserved).
ed: indicates whether a data area error has occurred (0 no error, 1 error).
et: indicates whether a tag area error has occurred (0 no error, 1 error).
es: set if an error occurs in the first doubleword.
ee: set if an error occurs on the sysad bus.
eb: set if a data error occurs in addition to an instruction error (indicated by the other bits). if this bit is set, it indicates that flushing is required for the data cache after the instruction error has been processed.
0: reserved. write 0 to these bits. zero is returned when these bits are read.
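a cache error handler typically starts by decoding these status bits. the c sketch below shows one way to do so. the mask names and helper functions are hypothetical; only the bit positions are taken from figure 6-13.

```c
#include <stdint.h>

/* hypothetical masks for the cache error register bits */
enum {
    CACHEERR_EB = 1u << 25,  /* data error in addition to instruction error */
    CACHEERR_EE = 1u << 26,  /* error on the sysad bus */
    CACHEERR_ES = 1u << 27,  /* error in the first doubleword */
    CACHEERR_ET = 1u << 28,  /* tag area error */
    CACHEERR_ED = 1u << 29,  /* data area error */
    CACHEERR_EC = 1u << 30,  /* cache level (0 internal) */
    CACHEERR_ER = 1u << 31   /* cache type (0 instruction, 1 data) */
};

/* returns 1 if the error was reported for the data cache */
static int cacheerr_is_data_cache(uint32_t cacheerr)
{
    return (cacheerr & CACHEERR_ER) != 0;
}

/* returns 1 if the data cache must be flushed after the
 * instruction error has been processed (eb bit set) */
static int cacheerr_needs_dcache_flush(uint32_t cacheerr)
{
    return (cacheerr & CACHEERR_EB) != 0;
}
```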
6.2.13 errorepc register (30)

the errorepc (error exception program counter) register is similar to the epc register. it is used to store the program counter value at which the reset, soft reset, nmi, or cache error exception has been processed.

the readable/writable errorepc register holds one of the following virtual addresses, at which instruction execution can resume after servicing an error.

• virtual address of the instruction that directly caused the exception.
• virtual address of the preceding branch or jump instruction (when the instruction associated with the exception is in a branch delay slot, and the bd bit in the cause register is set (1)).
• virtual address of the instruction immediately after the wait instruction, when the standby mode is released by a reset, soft reset, nmi, or cache error exception immediately after execution of the wait instruction.

there is no branch delay slot indication for the errorepc register.

figure 6-14. errorepc register
32-bit mode: bits 31 to 0 = errorepc
64-bit mode: bits 63 to 0 = errorepc

errorepc: program counter that indicates the restart address after a reset, soft reset, nmi, or cache error exception.
6.3 details of exceptions

if an exception occurs in the processor, the exl bit of the status register is set to 1, and the system enters the kernel mode. usually, the ksu field of the status register is reset to 00 and the exl bit is reset to 0 by an exception handler, after information has been saved, to enable occurrence of an exception in the exception handler. set the exl bit to 1 again in the exception handler so that the saved information is not lost by any other exception while it is being restored.

when the exception processing has been completed, the setting of the ksu field before the occurrence of the exception is restored and the exl bit is reset to 0. for details, refer to the description of the eret instruction in chapter 17 cpu instruction set.

remark if both the exl and erl bits of the status register are 0, the user mode, supervisor mode, or kernel mode is selected as the operating mode, depending on the value of the ksu field of the status register. if either the exl or erl bit is 1, the processor enters the kernel mode.

6.3.1 exception types

exceptions are classified as the following types, according to the internal status of the processor retained when an exception occurs.

• reset exceptions
• soft reset exceptions (nmi exception)
• cache error exceptions
• processor exceptions other than above (general exceptions)

when an exception occurs, the registers in the processor are set as follows.

(1) reset exceptions

    t: undefined
       random ← tlbentries − 1
       wired ← 0
       config ← 0 || ec || undefined^6 || 110110 || be || 110011011110 || undefined^3
       errorepc ← pc
       sr ← undefined^9 || 1 || undefined^19 || 1 || undefined^2
       performancecounter ← 0
       pc ← 0xffff ffff bfc0 0000

(2) soft reset and nmi exceptions

    t: errorepc ← pc
       sr ← sr[31:23] || 1 || sr[21] || 1 || sr[19:3] || 1 || sr[1:0]
       pc ← 0xffff ffff bfc0 0000

(3) cache error exceptions

    t: errorepc ← pc
       cacheerr ← er || ec || ed || et || es || ee || eb || 0^25
       sr ← sr[31:3] || 1 || sr[1:0]
       if sr[22] = 1 then                       /* when the bev bit is set to 1 */
           pc ← 0xffff ffff bfc0 0200 + 0x100   /* access to the rom area */
       else
           pc ← 0xffff ffff a000 0000 + 0x100   /* access to the main memory area */
       endif

(4) general exceptions

    t: cause ← bd || 0 || ce || 0^12 || cause[15:8] || exccode || 0^2
       if sr[1] = 0 then                        /* user or supervisor mode when exception processing is not in progress */
           epc ← pc
       endif
       sr ← sr[31:2] || 1 || sr[0]
       if sr[22] = 1 then                       /* when the bev bit is set to 1 */
           pc ← 0xffff ffff bfc0 0200 + vector  /* access to the uncached area */
       else
           pc ← 0xffff ffff 8000 0000 + vector  /* access to the cache area */
       endif
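the program counter selection in the pseudocode above can be modelled in c. the sketch below covers the 64-bit mode only and combines the base addresses from the pseudocode with the vector offsets listed in tables 6-4 and 6-5. the enum and function names are hypothetical.

```c
#include <stdint.h>

/* hypothetical classification of exceptions by vector */
typedef enum {
    EXC_RESET,        /* reset, soft reset, nmi */
    EXC_CACHE_ERROR,
    EXC_TLB_REFILL,   /* tlb mismatch, exl = 0 */
    EXC_XTLB_REFILL,  /* xtlb mismatch, exl = 0 */
    EXC_OTHER         /* all other general exceptions */
} exc_kind;

/* returns the 64-bit mode vector address for an exception kind,
 * given the value of the bev bit of the status register. */
static uint64_t vector_address_64(exc_kind kind, int bev)
{
    const uint64_t rom_base = 0xFFFFFFFFBFC00200ULL;  /* bev = 1 */
    uint64_t base = bev ? rom_base : 0xFFFFFFFF80000000ULL;

    switch (kind) {
    case EXC_RESET:
        return 0xFFFFFFFFBFC00000ULL;  /* fixed, independent of bev */
    case EXC_CACHE_ERROR:
        /* the bev = 0 base is in the uncached area for cache errors */
        return (bev ? rom_base : 0xFFFFFFFFA0000000ULL) + 0x100;
    case EXC_TLB_REFILL:
        return base + 0x000;
    case EXC_XTLB_REFILL:
        return base + 0x080;
    default:
        return base + 0x180;
    }
}
```

the cache error case is handled separately because its bev = 0 base differs from that of the other exceptions: the handler must run from a space where the cache is not used.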
6.3.2 exception vector address

if an exception occurs, an exception vector address is set to the program counter, and the processor's processing branches from the main program. locate a program that processes the exception (exception handler) at the position of the exception vector address.

the vector address is the sum of a base address and a vector offset. the vector address differs depending on the type of exception. the 64-/32-bit mode exception vectors and their offset values are shown below.

table 6-4. 32-bit mode exception vector addresses

exception                 vector base address (virtual address)               vector offset
reset, soft reset, nmi    0xbfc0 0000 (bev bit is automatically set to 1)     0x0000
cache error               0xa000 0000 (bev = 0), 0xbfc0 0200 (bev = 1)        0x0100
tlb mismatch, exl = 0     0x8000 0000 (bev = 0), 0xbfc0 0200 (bev = 1)        0x0000
xtlb mismatch, exl = 0    0x8000 0000 (bev = 0), 0xbfc0 0200 (bev = 1)        0x0080
other                     0x8000 0000 (bev = 0), 0xbfc0 0200 (bev = 1)        0x0180

table 6-5. 64-bit mode exception vector addresses

exception                 vector base address (virtual address)                                   vector offset
reset, soft reset, nmi    0xffff ffff bfc0 0000 (bev bit is automatically set to 1)               0x0000
cache error               0xffff ffff a000 0000 (bev = 0), 0xffff ffff bfc0 0200 (bev = 1)        0x0100
tlb mismatch, exl = 0     0xffff ffff 8000 0000 (bev = 0), 0xffff ffff bfc0 0200 (bev = 1)        0x0000
xtlb mismatch, exl = 0    0xffff ffff 8000 0000 (bev = 0), 0xffff ffff bfc0 0200 (bev = 1)        0x0080
other                     0xffff ffff 8000 0000 (bev = 0), 0xffff ffff bfc0 0200 (bev = 1)        0x0180

• vector of reset, soft reset, and nmi exception
the vector address (virtual) of each of the reset, soft reset, and nmi exceptions is in the kseg1 (uncached, non-tlb mapping) area.

• vector of cache error exception
the vector address (virtual) of the cache error exception is in the kseg1 (uncached, non-tlb mapping) area.

• vector of tlb refill exception (exl = 0)
when the bev bit is 0, the vector address (virtual) of this exception is in the kseg0 (cacheable, non-tlb mapping) area. when the bev bit is 1, the vector address (virtual) of this exception is in the kseg1 (uncached, non-tlb mapping) area.
• vector of general exception
when the bev bit is 0, the vector address (virtual) of this exception is in the kseg0 (cacheable, non-tlb mapping) area. when the bev bit is 1, the vector address (virtual) of this exception is in the kseg1 (uncached, non-tlb mapping) area.

(1) selecting tlb refill exception vector

the isa of mips iii or later has the following two tlb refill exception vectors.

• for referencing 32-bit address space (tlb mismatch)
• for referencing 64-bit address space (xtlb mismatch)

the tlb mismatch vector is selected in accordance with the addressing space (user, supervisor, or kernel) of the address that has generated a tlb miss, and the value of the corresponding extension addressing bit (ux, sx, or kx) of the status register. the current operating mode of the processor matters only insofar as it determines the address space in which the address exists.

the context register and xcontext register are completely different page table pointer registers. each indicates a different page table and is used for refilling. no matter which tlb exception (refill exception, invalid exception, tlbl exception, or tlbs exception) occurs, the address is loaded to the badvpn2 field of both registers in the same way as in the vr4000.

remark unlike the vr5500, the vr4000 selects a vector in accordance with the current operating mode of the processor (user, supervisor, or kernel) and the value of the corresponding extension addressing bit (ux, sx, or kx) of the status register. in the vr4000, the context register and xcontext register are not completely separate registers, but share the ptebase field. if a mismatch occurs at a specific address, a tlb refill exception or xtlb refill exception occurs, depending on the source of reference. unless a mismatch handler decodes the address and selects a page table, only one page table can be used.
table 6-6 shows the addresses that generate tlb mismatches and the position of the tlb refill exception vector according to the corresponding mode bit.
table 6-6. tlb refill exception vector

space        virtual address range                             area                                          exception vector
kernel       0xffff ffff e000 0000 to 0xffff ffff ffff ffff    kseg3                                         tlb mismatch (kx = 0) or xtlb mismatch (kx = 1)
supervisor   0xffff ffff c000 0000 to 0xffff ffff dfff ffff    sseg, ksseg                                   tlb mismatch (sx = 0) or xtlb mismatch (sx = 1)
kernel       0xc000 0000 0000 0000 to 0xc000 0ffe ffff ffff    xkseg                                         xtlb mismatch (kx = 1)
supervisor   0x4000 0000 0000 0000 to 0x4000 0fff ffff ffff    xsseg, xksseg                                 xtlb mismatch (sx = 1)
user         0x0000 0000 8000 0000 to 0x0000 0fff ffff ffff    xsuseg, xuseg, xkuseg                         xtlb mismatch (ux = 1)
user         0x0000 0000 0000 0000 to 0x0000 0000 7fff ffff    useg, xuseg, suseg, xsuseg, kuseg, xkuseg     tlb mismatch (ux = 0) or xtlb mismatch (ux = 1)
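the selection rule of table 6-6 can be sketched in c as follows. the names are hypothetical, and the function assumes the address lies in one of the ranges listed in the table (it does not check for the address error space or for unmapped areas, which never cause a tlb miss).

```c
#include <stdint.h>

/* hypothetical result type: which refill vector is taken */
typedef enum { VEC_TLB_REFILL, VEC_XTLB_REFILL } refill_vector;

/* selects the refill vector from the faulting virtual address and the
 * ux, sx, and kx bits of the status register, following table 6-6.
 * assumption: vaddr is a mapped address from one of the table rows. */
static refill_vector select_refill_vector(uint64_t vaddr,
                                          int ux, int sx, int kx)
{
    if (vaddr >= 0xFFFFFFFFE0000000ULL)   /* kseg3 */
        return kx ? VEC_XTLB_REFILL : VEC_TLB_REFILL;
    if (vaddr >= 0xFFFFFFFFC0000000ULL)   /* sseg, ksseg */
        return sx ? VEC_XTLB_REFILL : VEC_TLB_REFILL;
    if (vaddr <= 0x000000007FFFFFFFULL)   /* lower 2 gb of user space */
        return ux ? VEC_XTLB_REFILL : VEC_TLB_REFILL;
    /* remaining rows (xkseg, xsseg, upper user space) are reachable
     * only with 64-bit addressing, so the xtlb vector is used */
    return VEC_XTLB_REFILL;
}
```

note that the processor's current operating mode is not an input: only the address and the extension addressing bits decide the vector.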
6.3.3 priority of exceptions

when more than one exception occurs for a single instruction, only the exception with the highest priority is selected for processing. table 6-7 lists the priorities.

table 6-7. exception priority order

priority   exception
high       cold reset
           soft reset
           nmi
           debug break (instruction fetch)
           address error (instruction fetch)
           tlb/xtlb refill (instruction fetch)
           tlb invalid (instruction fetch)
           cache error (instruction fetch)
           bus error (instruction fetch)
           system call
           breakpoint
           coprocessor unusable
           reserved instruction
           trap
           integer overflow
           floating-point
           debug break (data access)
           address error (data access)
           tlb/xtlb refill (data access)
           tlb invalid (data access)
           tlb modified (data write)
           cache error (data access)
           bus error (data access)
           watch
low        interrupt (other than nmi)

hereafter, handling of exceptions by hardware is referred to as "process", and handling of exceptions by software is referred to as "service".
6.4 details of exceptions

6.4.1 reset exception

(1) cause
the reset exception occurs when the coldreset# signal goes from active to inactive. this exception is not maskable.

(2) processing
the special interrupt vector for reset exception is used.

• in 32-bit mode: 0xbfc0 0000 (virtual address)
• in 64-bit mode: 0xffff ffff bfc0 0000 (virtual address)

the reset exception vector resides in unmapped and uncached areas, so the hardware need not initialize the tlb or the cache to process this exception. it also means the processor can fetch and execute instructions while the caches and virtual memory are in an undefined state.

when this exception occurs, the contents of all registers are undefined except for the following.

• the sr bit of the status register is cleared (0).
• the erl and bev bits of the status register are set (1).
• the random register is set to the value of its upper bound (47).
• the wired register is initialized to 0.
• the performance counter register is initialized to 0.
• some bits of the config register are set in accordance with the input status of the initialization interface signal.

(3) servicing
the reset exception is serviced by:

• initializing all processor registers, coprocessor registers, tlb, caches, and the memory system
• performing diagnostic tests
• bootstrapping the operating system
6.4.2 soft reset exception

(1) cause
a soft reset occurs when the reset# signal goes from active to inactive while the coldreset# signal remains inactive. this exception is not maskable.

(2) processing
the special interrupt vector for reset exception (same location as reset) is used.

• in 32-bit mode: 0xbfc0 0000 (virtual address)
• in 64-bit mode: 0xffff ffff bfc0 0000 (virtual address)

this vector is located within unmapped and uncached areas, so that the hardware need not initialize the tlb or the cache to process this exception. the sr bit of the status register is set to 1 to distinguish this exception from a reset exception.

when this exception occurs, the contents of all registers are saved except for the following.

• the program counter value at which the exception occurs is set to the errorepc register.
• the erl, sr, and bev bits of the status register are set (1).

during a soft reset, access to the cache or system interface may be aborted. this means that the contents of the cache and memory will be undefined if a soft reset occurs.

(3) servicing
the soft reset exception is serviced by:

• saving the current processor states for diagnostic tests
• reinitializing the system in the same way as for a reset exception
6.4.3 nmi exception

(1) cause
the nmi (non-maskable interrupt) exception occurs when the signal input to the nmi# pin becomes active. it can also be generated by writing 1 to bit 6 of the internal interrupt register from an external source via sysad6.

this exception is not maskable; it occurs regardless of the settings of the exl, erl, and ie bits of the status register.

(2) processing
the special interrupt vector for nmi exception is used.

• in 32-bit mode: 0xbfc0 0000 (virtual address)
• in 64-bit mode: 0xffff ffff bfc0 0000 (virtual address)

this vector is located within unmapped and uncached areas, so that the hardware need not initialize the tlb or the cache to process an nmi exception. the sr bit of the status register is set (1) to distinguish this exception from a reset exception.

because the nmi exception can occur even while another exception is being processed, program execution cannot be continued after the nmi exception has been processed.

nmi occurs only at instruction boundaries. the states of the caches and memory system are saved by this exception.

when this exception occurs, the contents of all registers are saved except for the following.

• the program counter value at which the exception occurs is set to the errorepc register.
• the erl, sr, and bev bits of the status register are set (1).

(3) servicing
the nmi exception is serviced by:

• saving the current processor states for diagnostic tests
• reinitializing the system in the same way as for a reset exception
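because the reset, soft reset, and nmi exceptions share one vector, the handler at that vector has to inspect the sr bit of the status register to tell them apart. the c sketch below illustrates the check. the bit positions (bev = bit 22, sr = bit 20, erl = bit 2) are inferred from the register-update pseudocode in 6.3.1 and should be treated as assumptions, as should all names.

```c
#include <stdint.h>

/* assumed status register bit positions, inferred from 6.3.1 */
enum {
    STATUS_ERL = 1u << 2,
    STATUS_SR  = 1u << 20,
    STATUS_BEV = 1u << 22
};

/* hypothetical classification of the cause of entry to the reset vector */
typedef enum { BOOT_COLD_RESET, BOOT_SOFT_RESET_OR_NMI } boot_cause;

/* sr = 0 indicates a reset exception; sr = 1 indicates a soft reset
 * or nmi. the status register alone cannot separate those two. */
static boot_cause classify_boot(uint32_t status)
{
    return (status & STATUS_SR) ? BOOT_SOFT_RESET_OR_NMI : BOOT_COLD_RESET;
}
```

in the soft reset or nmi case the errorepc register still holds the interrupted program counter, which the handler can save for diagnostics before reinitializing the system.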
6.4.4 address error exception

(1) cause
the address error exception occurs when an attempt is made to execute one of the following. this exception is not maskable.

• execution of the lw or sw instruction for word data that is not located on a word boundary
• execution of the lh or sh instruction for halfword data that is not located on a halfword boundary
• execution of the ld or sd instruction for doubleword data that is not located on a doubleword boundary
• referencing the kernel address space in user or supervisor mode
• referencing the supervisor space in user mode
• fetching an instruction that is not located on a word boundary
• referencing the address error space
• referencing the supervisor or kernel address space in supervisor or kernel mode using an address whose bit 31 is not sign-extended to bits 32 to 63 in 32-bit mode

(2) processing
the general exception vector is used for this exception. the adel or ades code in the cause register is set. if this exception has been caused by an instruction reference or load operation, adel is set. if it has been caused by a store operation, ades is set.

when this exception occurs, the badvaddr register stores the virtual address that was not properly aligned or was referenced in protected address space. the contents of the vpn field of the context and entryhi registers are undefined, as are the contents of the entrylo register.

the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
the kernel reports the unix™ sigsegv (segmentation violation) signal to the current process, and this exception is usually fatal.
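the alignment-related causes listed in (1) reduce to a single check on the low address bits. the following c sketch shows that check; the function name is hypothetical, and the access size is assumed to be a power of two (2 for lh/sh, 4 for lw/sw and instruction fetch, 8 for ld/sd).

```c
#include <stdint.h>

/* hypothetical helper: returns 1 if an access of `size` bytes at
 * `vaddr` is not on its natural boundary, which is one of the causes
 * of the address error exception.
 * assumption: size is a power of two (2, 4, or 8). */
static int is_misaligned(uint64_t vaddr, unsigned size)
{
    return (vaddr & (uint64_t)(size - 1u)) != 0;
}
```

the privilege-related causes (for example, referencing the kernel address space in user mode) depend on the operating mode and are not covered by this check.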
(4) restrictions

(a) with vr5500 ver. 1.x, when the return address (contents of the epc register) to which execution returns from an exception handler by executing the eret instruction is in the address error area, a value different from the contents of the program counter is stored in the epc register if an interrupt occurs immediately after execution of the eret instruction. this restriction does not apply to ver. 2.0 or later.

(b) with vr5500 ver. 2.0 or later, suppose that a jump/branch instruction is located two instructions before the boundary with the address error space. if no branch prediction miss (including ras miss), eret instruction commitment, or exception (other than the address error exception described here) is committed between execution of that jump/branch instruction and commitment of an address error exception due to one of the causes listed below, the address stored in the badvaddr register by that address error exception is the address two instructions after the jump/branch instruction (the boundary with the address error space). the correct address is stored in the epc register, however. therefore, do not locate a jump/branch instruction at the position two instructions before the boundary with the address error space.

this restriction applies to the following causes of the address error exception.

• if an attempt is made to fetch an instruction in the kernel address space in the user or supervisor mode
• if an attempt is made to fetch an instruction in the supervisor address space in the user mode
• if an attempt is made to fetch an instruction not located at the word boundary
• if an attempt is made to reference the address error space in the kernel mode

this restriction is included in the specifications of the vr5500.
caution with the v r 5500, bits 58 to 40 of an address that is different from the actual value of the program counter are stored in the badvaddr register and epc register if an address error exception occurs as a result of an execution jump to the address error space in the 64-bit mode. if an address error exception occurs, therefore, do not reference the badvaddr and epc registers. however, if an address error exception occurs because execution is made to jump to the address error space by the jr or jalr instruction, an incorrect address is stored in the epc register as mentioned above, but the same value as the program counter is stored in the badvaddr register.
6.4.5 tlb exceptions

three types of tlb exceptions can occur.

• tlb refill exception
• tlb invalid exception
• tlb modified exception

the following three sections describe these tlb exceptions.

(1) tlb refill exception (32-bit mode)/xtlb refill exception (64-bit mode)

(a) cause
the tlb refill exception occurs when there is no tlb entry matching the address to be referenced, or when there are multiple tlb entries matching the address to be referenced. this exception is not maskable.

(b) processing
there are two special exception vectors for this exception; one for 32-bit addressing mode, and one for 64-bit addressing mode. the ux, sx, and kx bits of the status register determine which vector to use, depending on whether the 32-bit or 64-bit space is used for the user, supervisor, or kernel mode. when the exl bit of the status register is set to 0, either of these two special vectors is referenced. when the exl bit is set to 1, the general exception vector is referenced.

this exception sets the tlbl or tlbs code in the exccode field of the cause register. if this exception has been caused by an instruction reference or load operation, tlbl is set. if it has been caused by a store operation, tlbs is set.

when this exception occurs, the badvaddr, context, xcontext, and entryhi registers hold the virtual address that failed address translation. the entryhi register also contains the asid from which the translation fault occurred. the random register normally contains a valid location in which to place the replacement tlb entry. the contents of the entrylo register are undefined.

the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).
(c) servicing to service this exception, the contents of the context or xcontext register are used as a virtual address to load memory words containing the physical page frame and access control bits for a pair of tlb entries. the memory word is written into the tlb entry by using the entrylo0, entrylo1, or entryhi register. if the address to be referenced matches two or more entries (tlb shutdown), also clear the ts bit of the status register to 0. it is possible that the physical page frame and access control bits are placed in a page where the virtual address is not resident in the tlb. this condition is processed by allowing a tlb refill exception in the tlb refill exception handler. in this case, the general exception vector is used because the exl bit of the status register is set (1).
(2) tlb invalid exception

(a) cause
the tlb invalid exception occurs when the tlb entry that matches the virtual address to be referenced is invalid (v bit is 0). this exception is not maskable.

(b) processing
the general exception vector is used for this exception. the tlbl or tlbs code in the exccode field of the cause register is set. if this exception has been caused by an instruction reference or load operation, tlbl is set. if it has been caused by a store operation, tlbs is set.

when this exception occurs, the badvaddr, context, xcontext, and entryhi registers contain the virtual address that failed address translation. the entryhi register also contains the asid from which the translation fault occurred. the random register normally stores a valid location in which to place the replacement tlb entry. the contents of the entrylo register are undefined.

the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(c) servicing
usually, the v bit of a tlb entry is cleared in the following cases.

• when a virtual address does not exist
• when the virtual address exists, but is not in main memory (a page fault)
• when a trap is required on any reference to the page (for example, to maintain a reference bit)

after servicing the cause of a tlb invalid exception, the tlb entry location is identified with a tlbp (tlb probe) instruction, and the entry is replaced by another entry with its v bit set (1).
(3) tlb modified exception

(a) cause
the tlb modified exception occurs when the tlb entry that matches the virtual address referenced by the store instruction is valid (v bit is 1) but is not writable (d bit is 0). this exception is not maskable.

(b) processing
the general exception vector is used for this exception, and the mod code in the exccode field of the cause register is set.

when this exception occurs, the badvaddr, context, xcontext, and entryhi registers hold the virtual address that failed address translation. the entryhi register also contains the asid from which the translation fault occurred. the contents of the entrylo register are undefined.

the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(c) servicing
the kernel uses the failed virtual address or virtual page number to identify the corresponding access control bits. the page identified may or may not permit write accesses; if writes are not permitted, a write protection violation occurs.

if write accesses are permitted, the page frame is marked dirty (writable) by the kernel in its own data structures. the tlbp instruction places the index of the tlb entry that must be altered into the index register. the word data containing the physical page frame and access control bits (with the d bit set (1)) is loaded to the entrylo register, and the contents of the entryhi and entrylo registers are written into the tlb.
6.4.6 cache error exception

(1) cause
if a parity error of the cache is detected, a cache error exception occurs. this exception can be masked by the de bit of the status register.
when an instruction or data is read from an external source, the timing of the cache error exception differs depending on the data transfer format. when a block is transferred, only the first word is checked for an error at the time of transfer. if an error is found in the first word, the exception occurs immediately. if an error is in one of the other words, the exception occurs when the processor uses that data. during a single transfer, the exception occurs as soon as an error is found in the data.

(2) processing
the processor sets the erl bit of the status register to 1, saves the exception restart address in the errorepc register, and transfers control to the following special vector in a space where the cache cannot be used.
- when bev bit = 0, the vector is 0xffff ffff a000 0100
- when bev bit = 1, the vector is 0xffff ffff bfc0 0300

(3) servicing
all errors must be logged. to correct a parity error, the system invalidates the cache block by using the cache instruction, overwrites the old data via a cache miss, and resumes execution by using the eret instruction. any other error is uncorrectable and may be fatal to the current process.

caution because the data cache of the vr5500 has a non-blocking structure, a cache error exception occurs asynchronously. even if a cache miss occurs, the subsequent instructions can be executed as long as they are not dependent upon the line where the miss occurred. therefore, the value of the program counter when the cache error exception occurs is not always the address of the instruction that has caused the exception.
consequently, resuming execution from the instruction responsible for the exception is not guaranteed even if the system restores from the exception by using the eret instruction.
6.4.7 bus error exception

(1) cause
a bus error exception is raised by board-level circuitry for events such as bus time-out, local bus parity errors, and invalid physical memory addresses or access types. this exception is not maskable.
when an instruction or data is read from an external source, the timing of the bus error exception differs depending on the data transfer format. when a block is transferred, only the first word is checked for an error at the time of transfer. if an error is found in the first word, the exception occurs immediately. if an error is in one of the other words, the exception occurs when the processor uses that data. during a single transfer, the exception occurs as soon as an error is found in the data.

(2) processing
the general exception vector is used for a bus error exception. the ibe or dbe code is set in the exccode field of the cause register: ibe if the cause of the exception is an instruction reference (instruction fetch), dbe if it is a data reference (load/store instruction).
the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
the physical address at which the fault occurred can be computed from information available in the system control coprocessor (cp0) registers.
- if the ibe code in the cause register is set (indicating an instruction fetch), the virtual address of the instruction is the contents of the epc register, plus 4 if the bd bit of the cause register is set to 1.
- if the dbe code is set (indicating a load or store), the virtual address of the instruction that caused the exception is the contents of the epc register, plus 4 if the bd bit of the cause register is set to 1 (in that case the epc register holds the address of the preceding branch instruction). the virtual address referenced by the load or store instruction can then be obtained by interpreting the instruction.
the physical address can be obtained by using the tlbp instruction and reading the entrylo register to compute the physical page number.
at the time of this exception, the kernel reports the unix sigbus (bus error) signal to the current process, but the exception is usually fatal.

caution because the data cache of the vr5500 has a non-blocking structure, a bus error exception occurs asynchronously. even if a cache miss occurs, the subsequent instructions can be executed as long as they are not dependent upon the line where the miss occurred. therefore, the value of the program counter when the bus error exception occurs is not always the address of the instruction that has caused the exception. consequently, resuming execution from the instruction responsible for the exception is not guaranteed even if the system restores from the exception by using the eret instruction.
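the address arithmetic described in the servicing text above can be sketched in c as follows. the function names are hypothetical. the position of the bd bit (bit 31 of the cause register) and the load/store encoding (base register in bits 25 to 21, signed 16-bit offset in bits 15 to 0) are the usual mips conventions and are assumptions here, since this section does not give them.

```c
#include <stdint.h>

/* Assumed: BD is bit 31 of the Cause register (usual MIPS layout). */
#define CAUSE_BD (1u << 31)

/* Virtual address of the instruction that caused the exception:
 * EPC itself, or EPC + 4 when the instruction sat in a branch delay slot. */
uint64_t faulting_insn_addr(uint64_t epc, uint32_t cause)
{
    return (cause & CAUSE_BD) ? epc + 4 : epc;
}

/* Virtual address referenced by a load/store, obtained by interpreting the
 * instruction word. gpr[] is a hypothetical copy of the saved registers. */
uint64_t load_store_vaddr(uint32_t insn, const uint64_t gpr[32])
{
    uint32_t base   = (insn >> 21) & 0x1Fu;          /* base register number */
    int16_t  offset = (int16_t)(insn & 0xFFFFu);     /* signed 16-bit offset */
    return gpr[base] + (uint64_t)(int64_t)offset;    /* sign-extended add */
}
```

a dbe handler would first call faulting_insn_addr, fetch the instruction word at that address, and then pass it to load_store_vaddr together with the saved register values.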
6.4.8 system call exception

(1) cause
a system call exception occurs during an attempt to execute the syscall instruction. this exception is not maskable.

(2) processing
the general exception vector is used for this exception, and the sys code is set in the exccode field of the cause register.
the epc register contains the address of the syscall instruction. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
when this exception occurs, control is moved to the applicable system routine. to resume execution, the epc register must be altered so that the syscall instruction does not re-execute; this is accomplished by adding 4 to the epc register before returning.
if the syscall instruction is in a branch delay slot, the jump or branch instruction must be decoded to identify the branch destination before execution can be resumed.

6.4.9 breakpoint exception

(1) cause
a breakpoint exception occurs when an attempt is made to execute the break instruction. this exception is not maskable.

(2) processing
the general exception vector is used for this exception, and the bp code is set in the exccode field of the cause register.
the epc register contains the address of the break instruction. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
when the breakpoint exception occurs, control is moved to the applicable system routine. additional distinctions can be made by loading the instruction whose address the epc register contains (the contents of the epc register plus 4 if the break instruction is in a branch delay slot) and analyzing the unused bits of the break instruction (bits 25 to 6).
to resume execution, the epc register must be altered so that the break instruction does not re-execute; this is accomplished by adding 4 to the epc register before returning.
if the break instruction is in a branch delay slot, the branch instruction must be decoded to identify the branch destination before execution can be resumed.
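as a small illustration of the servicing steps above, the following c sketch extracts the unused bits of a break instruction (bits 25 to 6) and computes the epc value used to skip a syscall or break instruction that is not in a branch delay slot. the function names are hypothetical.

```c
#include <stdint.h>

/* Bits 25 to 6 of the BREAK instruction word (20 bits), which software can
 * use to distinguish different kinds of breakpoints. */
uint32_t break_code(uint32_t insn)
{
    return (insn >> 6) & 0xFFFFFu;
}

/* EPC to return to so that a SYSCALL or BREAK instruction that is NOT in a
 * branch delay slot does not re-execute. The delay-slot case needs the
 * preceding branch to be decoded and is deliberately not handled here. */
uint64_t resume_epc(uint64_t epc)
{
    return epc + 4;
}
```

the delay-slot case is left out on purpose, because the branch destination depends on the branch instruction and on register contents at the time of the exception.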
6.4.10 coprocessor unusable exception

(1) cause
the coprocessor unusable exception occurs when an attempt is made to execute a coprocessor instruction in either of the following cases.
- the corresponding coprocessor unit has not been marked usable (cu0 bit of status register = 0)
- a cp0 instruction is executed in user or supervisor mode when the use of cp0 is disabled (cu0 bit of status register = 0)
this exception is not maskable.

(2) processing
the general exception vector is used for this exception, and the cpu code is set in the exccode field of the cause register. the ce bit of the cause register indicates which of the four coprocessors was referenced.
the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
the coprocessor unit to which the reference was attempted is identified by the ce bit of the cause register. the handler performs one of the following.
(a) if the process is entitled to access the coprocessor, the coprocessor is marked usable and execution is resumed.
(b) if the process is entitled to access the coprocessor, but the coprocessor does not exist or has failed, the coprocessor instruction can be decoded and emulated.
(c) if the bd bit in the cause register is set (1), the branch instruction must be decoded; then the coprocessor instruction can be emulated and execution resumed with the epc register advanced past the coprocessor instruction.
(d) if the process is not entitled to access the coprocessor, the kernel reports the unix sigill/ill_privin_fault (illegal instruction/privileged instruction fault) signal to the current process, and this exception is fatal.
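a handler typically begins by splitting the cause register into the fields mentioned above. the c sketch below assumes the usual mips cause register layout (exccode in bits 6 to 2, ce in bits 29 and 28, bd in bit 31) and the usual mips exccode values; neither is given in this section, so treat both as assumptions to check against the cause register description. all names are hypothetical.

```c
#include <stdint.h>

/* Assumed ExcCode values (usual MIPS encodings, not listed in this section). */
enum cause_exccode {
    CAUSE_EXC_INT = 0,  CAUSE_EXC_MOD = 1,  CAUSE_EXC_TLBL = 2,
    CAUSE_EXC_TLBS = 3, CAUSE_EXC_IBE = 6,  CAUSE_EXC_DBE = 7,
    CAUSE_EXC_SYS = 8,  CAUSE_EXC_BP = 9,   CAUSE_EXC_RI = 10,
    CAUSE_EXC_CPU = 11, CAUSE_EXC_OV = 12,  CAUSE_EXC_TR = 13,
    CAUSE_EXC_FPE = 15, CAUSE_EXC_WATCH = 23
};

/* ExcCode field: assumed to be bits 6 to 2 of Cause. */
unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0x1Fu; }

/* CE field: assumed to be bits 29 and 28; number of the referenced coprocessor. */
unsigned cause_ce(uint32_t cause) { return (cause >> 28) & 0x3u; }

/* BD bit: assumed to be bit 31; 1 if the instruction was in a branch delay slot. */
unsigned cause_bd(uint32_t cause) { return (cause >> 31) & 0x1u; }
```

for a coprocessor unusable exception, cause_exccode would return the cpu code and cause_ce the coprocessor number that the handler then checks against the privileges of the process.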
6.4.11 reserved instruction exception

(1) cause
the reserved instruction exception occurs when an attempt is made to execute one of the following instructions.
- an instruction with an undefined opcode (bits 31 to 26)
- a special instruction with an undefined sub opcode (bits 5 to 0)
- a regimm instruction with an undefined sub opcode (bits 20 to 16)
- a 64-bit instruction in 32-bit user or supervisor mode
64-bit operations are always valid in kernel mode regardless of the value of the kx bit in the status register. this exception is not maskable.

(2) processing
the general exception vector is used for this exception, and the ri code is set in the exccode field of the cause register.
the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
all currently defined mips isa instructions can be executed. the process executing at the time of this exception is handled by a unix sigill/ill_resop_fault (illegal instruction/reserved operand fault) signal. this exception is usually fatal.

6.4.12 trap exception

(1) cause
the trap exception occurs when a tge, tgeu, tlt, tltu, teq, tne, tgei, tgeui, tlti, tltui, teqi, or tnei instruction results in a true condition. this exception is not maskable.

(2) processing
the general exception vector is used for this exception, and the tr code is set in the exccode field of the cause register.
the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
at the time of a trap exception, the kernel reports the unix sigfpe/fpe_intovf_trap (floating-point exception/integer overflow) signal to the current process, and this exception is usually fatal.
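the instruction fields named in 6.4.11 (opcode in bits 31 to 26, special sub opcode in bits 5 to 0, regimm sub opcode in bits 20 to 16) can be pulled out of an instruction word as in the c sketch below. the function names are hypothetical, and the sketch only extracts the fields; deciding which values are undefined requires the instruction set tables.

```c
#include <stdint.h>

/* Major opcode: bits 31 to 26. */
unsigned insn_opcode(uint32_t insn) { return insn >> 26; }

/* Sub opcode of a SPECIAL instruction: bits 5 to 0. */
unsigned insn_special_sub(uint32_t insn) { return insn & 0x3Fu; }

/* Sub opcode of a REGIMM instruction: bits 20 to 16. */
unsigned insn_regimm_sub(uint32_t insn) { return (insn >> 16) & 0x1Fu; }
```

an emulator or diagnostic routine would read the word at the faulting address and use these fields to report which encoding triggered the exception.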
6.4.13 integer overflow exception

(1) cause
an integer overflow exception occurs when an add, addi, sub, dadd, daddi, or dsub instruction results in a two's complement overflow. this exception is not maskable.

(2) processing
the general exception vector is used for this exception, and the ov code is set in the exccode field of the cause register.
the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
at the time of the exception, the kernel reports the unix sigfpe/fpe_intovf_trap (floating-point exception/integer overflow) signal to the current process, and this exception is usually fatal to the current process.

6.4.14 floating-point operation exception

(1) cause
the floating-point operation exception occurs as a result of an operation of the floating-point coprocessor. this exception is not maskable.

(2) processing
the general exception vector is used for this exception, and the fpe code is set in the exccode field of the cause register.
the contents of the floating-point control/status register indicate the cause of this exception.

(3) servicing
this exception is cleared by clearing the corresponding bit of the floating-point control/status register.
if an unimplemented operation exception occurs, the kernel must emulate that instruction. if any other exception occurs, the kernel passes the exception to the user program that has caused the exception.
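the two's complement overflow condition that triggers the integer overflow exception in 6.4.13 can be expressed in c as follows for the 32-bit add and sub cases. this is a generic sketch of the arithmetic rule, not a description of the processor's internal logic, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit add overflows when both operands have the same sign and the sign
 * of the wrapped sum differs from it. Unsigned arithmetic is used so that
 * the wraparound itself is well defined in C. */
bool add32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;
    return ((~(ua ^ ub) & (ua ^ sum)) >> 31) != 0;
}

/* 32-bit subtract overflows when the operands have different signs and the
 * sign of the wrapped difference differs from the sign of the minuend. */
bool sub32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    return (((ua ^ ub) & (ua ^ diff)) >> 31) != 0;
}
```

the 64-bit instructions (dadd, daddi, dsub) follow the same rule with 64-bit operands and bit 63 as the sign bit.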
6.4.15 watch exception

(1) cause
a watch exception occurs when a load or store instruction references the physical address specified by the watchlo and watchhi registers. the watchlo and watchhi registers specify whether a load, a store, or both can initiate this exception.
- when the r bit of the watchlo register is set to 1: load instruction
- when the w bit of the watchlo register is set to 1: store instruction
- when both the r bit and w bit of the watchlo register are set to 1: load instruction or store instruction
the cache instruction never causes a watch exception.
the watch exception is held pending while the exl bit of the status register is set (1). the watch exception can be masked either by setting (1) the exl bit of the status register, or by clearing (0) the r and w bits of the watchlo register.

(2) processing
the general exception vector is used for this exception, and the watch code is set in the exccode field of the cause register.
the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
the watch exception is a debugging aid; typically the exception handler moves control to a debugger, allowing the user to examine the situation. to continue, the watch exception must be masked so that the faulting instruction can execute, and must then be re-enabled. the faulting instruction can be executed either by interpretation in the debugger or by setting breakpoints.
because the contents of the watchlo and watchhi registers are undefined after reset, initialize these registers by software (it is particularly important to clear (0) the r and w bits). if the registers are not initialized, an unintended watch exception may occur.
6.4.16 interrupt exception

(1) cause
the interrupt exception occurs when one of the eight interrupt sources (see note) is made active. the application of these interrupts differs depending on the system. an interrupt request signal from a pin is detected by its level.
each of the eight interrupts can be masked by clearing the corresponding bit in the im field of the status register, and all eight interrupts can be masked by clearing the ie bit of the status register.

note the eight sources are 1 timer interrupt, 5 ordinary interrupts, and 2 software interrupts.

remark the timer interrupt request signal is generated if the count register matches the compare register, or if the performance counter overflows. depending on the status of the tintsel pin after reset, either a timer interrupt request, or an interrupt request resulting from asserting the int5# pin or from an external write request (sysad5), is selected as the interrupt source reflected on the ip7 bit of the cause register.

(2) processing
the general exception vector is used for this exception, and the int code is set in the exccode field of the cause register.
the ip field of the cause register indicates the current interrupt requests. more than one of these bits may be set (or cleared) simultaneously if the interrupt request signals become active (inactive) before this register is read.
the epc register contains the address of the instruction that caused the exception. however, if this instruction is in a branch delay slot, the epc register contains the address of the preceding branch instruction, and the bd bit of the cause register is set (1).

(3) servicing
if a timer interrupt request occurs, check the contents of the performance counter to identify whether a match between the count register and compare register or an overflow of the performance counter has caused the interrupt.
if the interrupt is caused by one of the two software sources, the interrupt request is cleared by setting the corresponding cause register bit to 0.
if the interrupt is caused by hardware, the interrupt source is cleared by deactivating the corresponding interrupt request signal.
because an internal write buffer is provided, data may not be stored in an external device until execution of the other instructions in the pipeline is completed. therefore, make sure that the data has been stored correctly before the instruction that returns from the interrupt (eret) is executed. if the data has not been stored, interrupt request processing may be performed again even though no interrupt is actually pending.
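the masking rules in 6.4.16 can be sketched in c as follows: an interrupt is serviceable only if its ip bit is set, its im bit is set, and the ie bit is set. the bit positions used (ip and im in bits 15 to 8, ie in bit 0, exl in bit 1, erl in bit 2) are the usual mips status and cause register layouts and are assumptions here. the exl and erl checks reflect the flowchart annotations in this chapter that interrupts are disabled while either bit is 1. the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed Status register bits (usual MIPS layout). */
#define STATUS_IE  (1u << 0)
#define STATUS_EXL (1u << 1)
#define STATUS_ERL (1u << 2)

/* Returns an 8-bit mask of interrupt requests that are both pending
 * (Cause.IP) and enabled (Status.IM), or 0 if interrupts are disabled
 * globally. Bit n of the result corresponds to IPn / IMn. */
uint32_t serviceable_interrupts(uint32_t status, uint32_t cause)
{
    if (!(status & STATUS_IE))
        return 0;                          /* all eight masked by IE = 0 */
    if (status & (STATUS_EXL | STATUS_ERL))
        return 0;                          /* disabled while EXL or ERL = 1 */
    return ((status & cause) >> 8) & 0xFFu; /* IM field AND IP field */
}
```

a dispatcher would test bit 7 of the result for the source reflected on ip7, and bits 0 and 1 for the two software interrupts.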
6.5 exception processing flowcharts

the remainder of this chapter contains flowcharts for the processing of the following exceptions and for the servicing performed by their handlers.
- general exception processing and its exception handler
- tlb/xtlb refill exception processing and its exception handler
- cache error exception processing and its exception handler
- processing of reset, soft reset, and nmi exceptions, and their exception handlers
figure 6-15. general exception processing (1/2)

(a) hardware processing

[flowchart; the recoverable steps are listed in order]
start
- set the fp control/status register, entryhi (vpn2, asid), context/xcontext (vpn2), the cause register (exccode, ce), and the badvaddr register. the fp control/status register is set only when a floating-point exception occurs. the entryhi and context/xcontext registers are set only when a tlb invalid, tlb modified, tlb refill, or address error exception occurs.
- exl bit = 0?
  yes: if the instruction is in a branch delay slot, epc ← (pc − 4) and bd bit ← 1; otherwise epc ← pc and bd bit ← 0.
  no: the epc register and bd bit are left unchanged.
- exl bit ← 1. kernel mode is set and interrupts are disabled.
- bev bit = 0 (normal): pc ← 0xffff ffff 8000 0000 + 0x180 (unmapped, cacheable)
  bev bit = 1 (bootstrap): pc ← 0xffff ffff bfc0 0200 + 0x180 (unmapped, uncached)
- continue at a (software processing)

remark the interrupts can be masked by using the ie or im bits. the watch exception can be held pending by setting the exl bit to 1.
figure 6-15. general exception processing (2/2)

(b) software processing

[flowchart; the recoverable steps are listed in order]
a
- execute mfc0 instructions: context/xcontext, epc, status, cause
- execute mtc0 instruction (set status register): ksu bits ← 00, exl bit ← 0, ie bit ← 1
- check the cause register, and jump to each routine
- servicing of exception routine
- exl bit ← 1
- execute mtc0 instructions: epc, status
- execute eret instruction: pc ← epc, exl bit ← 0, ll bit ← 0
end

annotations in the figure:
- a tlb modified, tlb invalid, or tlb refill exception is prevented from occurring by using an unmapped area.
- watch and interrupt exceptions are disabled by setting the exl bit to 1.
- the os/system avoids all other exceptions.
- only reset, soft reset, and nmi exceptions are enabled.
- option: interrupts are enabled in kernel mode.
- after exl bit = 0 is set, all exceptions are enabled (except the interrupt exceptions masked by the ie and im bits).
- the register files are saved.
- the eret instruction cannot be executed in the branch delay slot of another jump instruction.
- the processor does not execute the instruction in the branch delay slot of the eret instruction.
figure 6-16. tlb/xtlb refill exception processing (1/2)

(a) hardware processing

[flowchart; the recoverable steps are listed in order]
start
- entryhi ← vpn2, asid; context/xcontext ← vpn2; set the cause register (exccode field, ce bit); set the badvaddr register.
- exl bit = 0? (check for multiple exceptions)
  yes: if the instruction is in a branch delay slot, epc ← (pc − 4) and bd bit ← 1; otherwise epc ← pc and bd bit ← 0. then vec. off. = 0x080 for an xtlb exception, or vec. off. = 0x000 otherwise.
  no: vec. off. = 0x180.
- exl bit ← 1. kernel mode is set and interrupts are disabled.
- bev bit = 0 (normal): pc ← 0xffff ffff 8000 0000 + vec. off. (unmapped, cacheable)
  bev bit = 1 (bootstrap): pc ← 0xffff ffff bfc0 0200 + vec. off. (unmapped, uncached)
- continue at b (software processing)
figure 6-16. tlb/xtlb refill exception processing (2/2)

(b) software processing

[flowchart; the recoverable steps are listed in order]
b
- execute mfc0 instruction: context/xcontext
- servicing of exception routine (see note): the physical address for the virtual address that is loaded into the context register is loaded into the entrylo register and written to the tlb.
- execute eret instruction: pc ← epc, exl bit ← 0, ll bit ← 0
end

annotations in the figure:
- a tlb modified, tlb invalid, or tlb refill exception is prevented from occurring by using an unmapped area.
- watch and interrupt exceptions are disabled by setting the exl bit to 1.
- the os/system avoids all other exceptions.
- only reset, soft reset, and nmi exceptions are enabled.
- ts bit is cleared upon tlb shutdown.
- the eret instruction cannot be executed in the branch delay slot of another jump instruction.
- the processor does not execute the instruction in the branch delay slot of the eret instruction.

note a tlb refill exception may reoccur while the data/instruction addresses are in a mapped area. if an exception reoccurs, servicing will jump to the general exception vector because the exl bit is 1. in this case, service the tlb miss in the general exception handler, return to the user program by using the eret instruction, and generate the tlb refill exception again.
figure 6-17. processing of cache error exception

[flowchart; the recoverable steps are listed in order]
hardware
start
- set the cache error register. erl bit ← 1.
- if the instruction is in a branch delay slot, errorepc ← (pc − 4); otherwise errorepc ← pc.
- bev bit = 0 (normal): pc ← 0xffff ffff a000 0000 + 0x100 (unmapped, uncached)
  bev bit = 1 (bootstrap): pc ← 0xffff ffff bfc0 0200 + 0x100 (unmapped, uncached)
software
- servicing of exception routine
- execute eret instruction: pc ← errorepc, erl bit ← 0, ll bit ← 0
end

annotations in the figure:
- exceptions related to the tlb and the cache error exception are prevented from occurring by using an unmapped and uncached area.
- interrupt exceptions are disabled because erl bit = 1.
- the os/system avoids all other exceptions.
- only reset, soft reset, and nmi exceptions are enabled.
- the eret instruction cannot be executed in the branch delay slot of another jump instruction.
- the processor does not execute the instruction in the branch delay slot of the eret instruction.
figure 6-18. processing of reset/soft reset/nmi exceptions

[flowchart; the recoverable steps are listed in order]
hardware
- soft reset or nmi exception: status register setting: bev bit ← 1, sr bit ← 1, erl bit ← 1
- reset exception: random ← 47, wired ← 0, update bits 31 to 6 of the config register; set the status register: bev bit ← 1, sr bit ← 0, erl bit ← 1
- errorepc ← pc
- pc ← 0xffff ffff bfc0 0000
software
- nmi?
  yes: servicing of nmi exception routine
  no: sr bit = 1: servicing of soft reset exception routine; sr bit = 0: servicing of reset exception routine
- eret instruction execution (option)
end

annotation in the figure: the processor does not provide an indication that distinguishes between nmi and soft reset. an indication at the system level is necessary.
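the vector addresses that appear in figures 6-15 to 6-18 can be summarized in one small c function. this is a sketch of the address selection only; the enum and function names are hypothetical, and the exl argument is the value of the exl bit at the moment the exception is taken (before the hardware sets it to 1).

```c
#include <stdint.h>

/* Hypothetical classification of exceptions for vector selection. */
enum vec_kind {
    VEC_RESET_SOFTRESET_NMI,
    VEC_CACHE_ERROR,
    VEC_TLB_REFILL,
    VEC_XTLB_REFILL,
    VEC_GENERAL
};

uint64_t exception_vector(enum vec_kind kind, int bev, int exl)
{
    uint64_t base, offset;

    if (kind == VEC_RESET_SOFTRESET_NMI)
        return 0xFFFFFFFFBFC00000ull;            /* figure 6-18 */

    if (kind == VEC_CACHE_ERROR) {               /* figure 6-17: uncached */
        base = bev ? 0xFFFFFFFFBFC00200ull : 0xFFFFFFFFA0000000ull;
        return base + 0x100;
    }

    base = bev ? 0xFFFFFFFFBFC00200ull : 0xFFFFFFFF80000000ull;
    if (kind == VEC_GENERAL || exl)
        offset = 0x180;                          /* general vector, or EXL = 1 */
    else if (kind == VEC_XTLB_REFILL)
        offset = 0x080;
    else
        offset = 0x000;                          /* TLB refill */
    return base + offset;
}
```

for example, a tlb refill exception taken while exl is already 1 resolves to the general exception vector, which matches the note under figure 6-16.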
chapter 7 floating-point unit

7.1 overview

the floating-point unit (fpu) operates as coprocessor cp1 of the cpu and executes floating-point operation instructions. it can use both single-precision (32-bit) and double-precision (64-bit) data, and can also convert a floating-point value into a fixed-point value or vice versa.
the fpu of the vr5500 conforms to ansi/ieee standard 754-1985, "ieee standard for binary floating-point arithmetic".

7.2 fpu registers

the fpu has 32 general-purpose registers and 32 control registers.

figure 7-1. registers of fpu (1/2)

(a) floating-point general-purpose registers

(i) when fr bit = 0 (each fgr is 32 bits wide, bits 31 to 0)

floating-point register (fpr)    floating-point general-purpose registers (fgr)
fpr0                             fgr0 (lower), fgr1 (higher)
fpr2                             fgr2 (lower), fgr3 (higher)
...                              ...
fpr28                            fgr28 (lower), fgr29 (higher)
fpr30                            fgr30 (lower), fgr31 (higher)

(ii) when fr bit = 1 (each fgr is 64 bits wide, bits 63 to 0)

floating-point register (fpr)    floating-point general-purpose register (fgr)
fpr0                             fgr0
fpr1                             fgr1
fpr2                             fgr2
fpr3                             fgr3
...                              ...
fpr28                            fgr28
fpr29                            fgr29
fpr30                            fgr30
fpr31                            fgr31
figure 7-1. registers of fpu (2/2)

(b) floating-point control registers (each 32 bits wide, bits 31 to 0)

fcr0 (implementation/revision)
reserved
fcr25 (condition code)
fcr26 (cause/flag)
reserved
fcr28 (enable/mode)
reserved
reserved
fcr31 (control/status)

7.2.1 floating-point general-purpose registers (fgrs)

the fpu has one set of 32 floating-point general-purpose registers (fgrs). the register length is 32 bits if the fr bit of the status register in cp0 is 0; it is 64 bits if the fr bit is 1. the cpu accesses an fgr by using a load, store, or transfer instruction.
(1) if the fr bit of the status register is 0, the general-purpose registers are used as sixteen 64-bit registers (fprs) that hold single-precision or double-precision floating-point data. each fpr corresponds to a pair of consecutively numbered fgrs, as shown in figure 7-1.
(2) if the fr bit of the status register is 1, the general-purpose registers are used as thirty-two 64-bit registers (fprs) that hold single-precision or double-precision floating-point data. in this case, each fpr corresponds to one fgr, as shown in figure 7-1.
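the correspondence between fprs and fgrs described in (1) and (2) can be sketched in c as follows. the function name and return convention are hypothetical. the even-number restriction for fr = 0 follows figure 7-1, which shows only fpr0, fpr2, ..., fpr30 in that mode.

```c
/* Maps a floating-point register number to the general-purpose register(s)
 * that physically hold it.
 *   fr = 1: one 64-bit FGR with the same number.
 *   fr = 0: a pair of 32-bit FGRs; fgr[0] is the lower half, fgr[1] the
 *           higher half, and only even FPR numbers are usable.
 * Returns the number of FGRs written to fgr[], or 0 if the FPR number is
 * not valid in the selected mode. */
int fpr_to_fgrs(unsigned fpr, int fr, unsigned fgr[2])
{
    if (fpr > 31)
        return 0;
    if (fr) {
        fgr[0] = fpr;
        return 1;
    }
    if (fpr % 2 != 0)
        return 0;            /* odd FPR numbers are not used when FR = 0 */
    fgr[0] = fpr;            /* lower 32 bits */
    fgr[1] = fpr + 1;        /* higher 32 bits */
    return 2;
}
```

for example, with fr = 0 a double-precision operand named fpr0 occupies fgr0 and fgr1, while with fr = 1 it occupies fgr0 alone.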
7.2.2 floating-point registers (fprs)

if the fr bit of the status register in cp0 is 0, sixteen floating-point registers (fprs) can be used. if the fr bit is 1, thirty-two fprs can be used. an fpr is a 64-bit logical register and holds a floating-point value when a floating-point operation has been executed. physically, an fpr consists of one or two general-purpose registers (fgrs). if the fr bit of the status register is 0, the fpr consists of two 32-bit fgrs. if the fr bit is 1, the fpr consists of one 64-bit fgr.
an fpr holds a single-precision or double-precision floating-point value. if the fr bit of the status register is 0, only an even number is used to specify an fpr. if the fr bit is 1, all the fpr register numbers are valid.
if the fr bit is 0 when a double-precision floating-point operation is executed, a pair of fgrs is used as a doubleword. if fpr0 is selected for a double-precision floating-point operation, for example, the two adjoining fgrs, fgr0 and fgr1, are used.

7.2.3 floating-point control registers (fcrs)

the fpu has 32 control registers. the vr5500 can use the following five fcrs.
- the control/status register (fcr31) controls and monitors exceptions. this register also holds the result of a comparison operation and sets the rounding mode.
- the enable/mode register (fcr28), cause/flag register (fcr26), and condition code register (fcr25) each hold part of the contents of fcr31, and set/hold the same contents.
- the implementation/revision register (fcr0) holds revision information on the fpu.
table 7-1 shows the assignment of the fcrs.

table 7-1. fcr

fcr no.           usage
fcr0              implementation/revision of coprocessor
fcr1 to fcr24     reserved
fcr25             condition code
fcr26             cause, flag
fcr27             reserved
fcr28             exception enable, rounding mode
fcr29, fcr30      reserved
fcr31             condition code, rounding mode, cause, exception enable, flag

when fcr0, fcr25, fcr26, fcr28, or fcr31 is read by the cfc1 instruction, the contents of the register are transferred to the main processor after execution of all the instructions in the pipeline has been completed.
each bit of fcr25, fcr26, fcr28, and fcr31 can be set or cleared by using the ctc1 instruction. data is written to these registers after execution of all the instructions in the pipeline has been completed.
7.3 floating-point control register

7.3.1 control/status register (fcr31)

the control/status register (fcr31) is a read/write register, and holds control data and status data. this register controls the rounding mode and enables the occurrence of floating-point exceptions. it also indicates information on an exception that has occurred in the instruction executed last, and information on exceptions that have accumulated thus far without being taken because they are masked.
figure 7-2 shows the configuration of fcr31. figure 7-3 shows the configuration of the cause, enable, and flag bits in fcr31.

figure 7-2. fcr31

bits        field
31 to 25    cc(7:1)
24          fs
23          cc0
22 to 18    0
17 to 12    cause (e, v, z, o, u, i)
11 to 7     enable (v, z, o, u, i)
6 to 2      flag (v, z, o, u, i)
1, 0        rm

figure 7-3. cause/enable/flag bits of fcr31

exception                      cause bit    enable bit    flag bit
e: unimplemented operation     17           (none)        (none)
v: invalid operation           16           11            6
z: division by zero            15           10            5
o: overflow                    14           9             4
u: underflow                   13           8             3
i: inexact operation           12           7             2

ieee754 defines how an exception is detected during a floating-point operation, how flags are set, and how an exception handler is called if an exception occurs. the mips architecture implements this specification by using the cause, enable, and flag bits of the control/status register. the flag bits correspond to the exception status flags of ieee754, and the cause and enable bits correspond to the exception handler of ieee754.
each bit of fcr31 is explained next.
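as a rough illustration of the layout in figures 7-2 and 7-3, before the individual bits are described, the following c sketch splits an fcr31 value into its fields. the structure and function names are hypothetical. the second function applies the rule stated in the enable-bit description of this section: an exception is raised when a cause bit and its corresponding enable bit are both set, and an unimplemented operation (e bit) always raises one because it has no enable bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the fields of FCR31 (figure 7-2). */
struct fcr31_fields {
    unsigned cc;      /* 8 bits: bit 0 = cc0 (bit 23), bits 7..1 = cc(7:1) */
    unsigned fs;      /* bit 24 */
    unsigned cause;   /* bits 17..12: E V Z O U I (E is the top bit, 0x20) */
    unsigned enable;  /* bits 11..7:  V Z O U I */
    unsigned flag;    /* bits 6..2:   V Z O U I */
    unsigned rm;      /* bits 1..0 */
};

struct fcr31_fields decode_fcr31(uint32_t v)
{
    struct fcr31_fields f;
    f.cc     = ((v >> 24) & 0xFEu) | ((v >> 23) & 0x1u);
    f.fs     = (v >> 24) & 0x1u;
    f.cause  = (v >> 12) & 0x3Fu;
    f.enable = (v >> 7) & 0x1Fu;
    f.flag   = (v >> 2) & 0x1Fu;
    f.rm     = v & 0x3u;
    return f;
}

/* True if this FCR31 value would raise a floating-point operation exception. */
bool fcr31_raises_exception(uint32_t v)
{
    struct fcr31_fields f = decode_fcr31(v);
    if (f.cause & 0x20u)                       /* E: no enable bit exists */
        return true;
    return ((f.cause & 0x1Fu) & f.enable) != 0; /* V Z O U I vs. enables */
}
```

because the v, z, o, u, and i bits sit in the same order in all three groups, the cause and enable groups can be compared with a single bitwise and after shifting.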
(1) fs bit
the fs bit enables flushing of a value that cannot be normalized (a denormalized number). if this bit is set and the enable bits of the underflow exception and inexact operation exception are not set, a denormalized result does not cause an unimplemented operation exception, but is flushed instead. whether the flushed result is 0 or the minimum normalized value depends on the rounding mode (refer to table 7-2).
however, the madd.fmt, nmadd.fmt, msub.fmt, and nmsub.fmt instructions cause the unimplemented operation exception to occur, regardless of the value of the fs bit.

table 7-2. flush value of denormalized number result

result of               rounding mode of flushed result
denormalized number     rn        rz        rp          rm
positive                +0        +0        +2^emin     +0
negative                −0        −0        −0          −2^emin

(2) cc bits
bits 31 to 25 and 23 of fcr31 are cc (condition) bits. these bits store the result of a floating-point comparison instruction. if the result is true, they are set to 1; if the result is false, they are cleared to 0. the cc bits are not affected by any instruction other than the comparison instruction and ctc1 instruction.

(3) cause bits
bits 17 to 12 of fcr31 are cause bits and reflect the result of the instruction executed last. the cause bits are logical extensions of the cp0 cause register and indicate the exceptions raised by the last floating-point operation. if the corresponding enable bit is set, an exception occurs. if one instruction causes two or more exceptions, all the corresponding bits are set.
the cause bits are rewritten by a floating-point operation (except the load, store, and transfer instructions). the e bit is set to 1 if emulation by software is necessary; otherwise it remains 0. each of the other bits is set to 1 if the corresponding ieee754 exception occurs, and cleared to 0 if the exception does not occur.
if a floating-point operation exception occurs, the operation result is not stored, and only the cause bits are affected.
(4) Enable bits

A floating-point operation exception occurs when both a cause bit and the corresponding enable bit are set. The exception occurs as soon as an enabled cause bit is set by a floating-point operation. It also occurs when a cause bit and its enable bit are set by the CTC1 instruction.

There is no enable bit for the unimplemented operation exception. When an unimplemented operation exception occurs, a floating-point operation exception always occurs.

Before returning from a floating-point operation exception, software must clear the enabled cause bit that raised the exception, to prevent the exception from recurring. Consequently, a cause bit that has been set cannot be seen from a program in user mode. When a user-mode handler needs the cause bit information, copy the value of the status register to another location.

If a cause bit is set but the corresponding enable bit is not set, no exception occurs and the default result defined by IEEE754 is stored. In this case, the exception caused by the immediately preceding floating-point operation can be identified by reading the cause bits.

(5) Flag bits

The flag bits accumulate and indicate the exceptions that have occurred since reset. If an exception defined by IEEE754 occurs, the corresponding flag bit is set to 1; otherwise it remains unchanged. A flag bit is never cleared by a floating-point operation. However, it can be set or cleared by software by writing a new value to FCR31 with the CTC1 instruction.

If a floating-point operation exception occurs, the hardware does not set the flag bit. Therefore, set the flag bit by software before processing is transferred to the user handler.

(6) Rounding mode control bits

Bits 1 and 0 of FCR31 are the RM (rounding mode control) bits. These bits define the rounding mode the FPU uses for all floating-point instructions.
Table 7-3. Rounding Mode Control Bits

    RM bits
    Bit 1  Bit 0   Mnemonic   Description
    0      0       RN         Rounds the result to the closest representable value. If the
                              result lies exactly halfway between two representable values,
                              it is rounded to the one whose least significant bit is 0.
    0      1       RZ         Rounds the result toward 0. The result is the closest
                              representable value whose absolute value does not exceed that
                              of the infinitely precise result.
    1      0       RP         Rounds the result toward +∞. The result is the closest
                              representable value not less than the infinitely precise result.
    1      1       RM         Rounds the result toward -∞. The result is the closest
                              representable value not greater than the infinitely precise
                              result.
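The field layout of FCR31 described in 7.3.1 can be illustrated with a small decoder. The following Python sketch is only a model for reading register dumps: the function name `decode_fcr31` and the dictionary keys are assumptions made for this example, and the bit positions are those read from Figures 7-2 and 7-3.

```python
# Hypothetical helper: split a 32-bit FCR31 value into the fields of Figure 7-2.
RM_NAMES = {0: "RN", 1: "RZ", 2: "RP", 3: "RM"}  # mnemonics from Table 7-3


def decode_fcr31(value):
    value &= 0xFFFFFFFF
    cause = (value >> 12) & 0x3F   # bits 17-12: E V Z O U I
    enable = (value >> 7) & 0x1F   # bits 11-7:    V Z O U I
    return {
        # CC(7:1) sits in bits 31-25 and CC0 in bit 23.
        "cc": (((value >> 25) & 0x7F) << 1) | ((value >> 23) & 1),
        "fs": (value >> 24) & 1,
        "cause": cause,
        "enable": enable,
        "flag": (value >> 2) & 0x1F,   # bits 6-2: V Z O U I
        "rm": RM_NAMES[value & 0x3],
        # An exception is pending when E is set (E has no enable bit)
        # or when any cause bit has its enable bit set.
        "exception": bool((cause & 0x20) or (cause & enable)),
    }
```

For example, a value with the Z cause bit (bit 15) and the Z enable bit (bit 10) both set decodes with `exception` true, while the same cause bit without its enable bit does not.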
7.3.2 Enable/mode register (FCR28)

The enable/mode register (FCR28) accesses only the enable, FS, and rounding mode control bits of FCR31. For details of each bit, refer to 7.3.1 Control/status register (FCR31).

Figure 7-4. FCR28

    Bits     Field
    31-12    0
    11-7     Enable (V Z O U I)
    6-3      0
    2        FS
    1-0      RM

7.3.3 Cause/flag register (FCR26)

The cause/flag register (FCR26) accesses only the cause and flag bits of FCR31. For details of each bit, refer to 7.3.1 Control/status register (FCR31).

Figure 7-5. FCR26

    Bits     Field
    31-18    0
    17-12    Cause (E V Z O U I)
    11-7     0
    6-2      Flag (V Z O U I)
    1-0      0

7.3.4 Condition code register (FCR25)

The condition code register (FCR25) accesses only the CC bits of FCR31. This register presents the CC bits as eight consecutive bits. For details of the CC bits, refer to 7.3.1 Control/status register (FCR31).

Figure 7-6. FCR25

    Bits     Field
    31-8     0
    7-0      CC
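The relationship between FCR31 and the three partial-view registers can be sketched as bit manipulation. The Python functions below are hypothetical helpers written for this illustration, not part of any VR5500 software interface; the bit positions are those read from Figures 7-2 and 7-4 to 7-6 (in particular, FS is assumed to sit at bit 2 of FCR28).

```python
# Hypothetical helpers deriving the FCR28/FCR26/FCR25 views from an FCR31 value.

def fcr28_view(fcr31):
    """Enable bits (11-7) and RM (1-0) stay in place; FS moves from bit 24 to bit 2."""
    return (fcr31 & 0x00000F83) | (((fcr31 >> 24) & 1) << 2)


def fcr26_view(fcr31):
    """Cause bits (17-12) and flag bits (6-2) stay in place; all other bits read 0."""
    return fcr31 & 0x0003F07C


def fcr25_view(fcr31):
    """CC(7:1) from bits 31-25 and CC0 from bit 23, packed into bits 7-0."""
    return (((fcr31 >> 25) & 0x7F) << 1) | ((fcr31 >> 23) & 1)
```

The point of the sketch is that FCR25 packs the CC bits, which are split around the FS bit in FCR31, into one contiguous byte.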
7.3.5 Implementation/revision register (FCR0)

The implementation/revision register (FCR0) is a read-only register that holds the implementation identification number and implementation revision number of the FPU, and the status of the supported floating-point functions. This information can be used to identify the coprocessor revision, to determine the performance level, and for self-diagnosis. Figure 7-7 shows the configuration of the implementation/revision register.

Figure 7-7. FCR0

    Bits     Field   Description
    31-20    0       Reserved. Write 0 to these bits. Zero is returned when these bits are read.
    19       3D      Support of three-dimensional graphics (0)
    18       PS      Support of single-precision data pair (0)
    17       D       Support of double-precision data (1)
    16       S       Support of single-precision data (1)
    15-8     Imp     Implementation identification number (0x55)
    7-0      Rev     Implementation revision number

Bits 19 to 16 indicate which functions are implemented in the VR5500. If a given function is not implemented, the corresponding bit is 0; if the function is implemented, the bit is 1.

The implementation revision number is a value in the form of y.x, where y is the major revision number stored in bits 7 to 4 and x is the minor revision number stored in bits 3 to 0.

The implementation revision number can be used to identify the revision of the chip. However, a modification of the chip is not always reflected in the revision number, and conversely a change in the revision number does not always reflect an actual modification of the chip. Therefore, develop programs so that they do not depend on the revision number in this register.
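As an illustration of the FCR0 layout in Figure 7-7, the following Python sketch splits a register value into its fields. The function name `decode_fcr0` and the sample revision value used in the usage note are assumptions for this example; only the field positions and the Imp value 0x55 come from the text above.

```python
# Hypothetical decoder for an FCR0 value, following Figure 7-7.

def decode_fcr0(value):
    return {
        "3d": bool((value >> 19) & 1),   # three-dimensional graphics support
        "ps": bool((value >> 18) & 1),   # single-precision data pair support
        "d": bool((value >> 17) & 1),    # double-precision support
        "s": bool((value >> 16) & 1),    # single-precision support
        "imp": (value >> 8) & 0xFF,      # implementation identification number
        # Major revision in bits 7-4, minor revision in bits 3-0.
        "rev": "%d.%d" % ((value >> 4) & 0xF, value & 0xF),
    }
```

With a made-up revision byte of 0x21, `decode_fcr0(0x00035521)` reports D and S as supported, Imp as 0x55, and the revision as "2.1". As the text notes, software should not make decisions based on the revision field.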
7.4 Data Format

7.4.1 Floating-point format

The FPU supports 32-bit (single-precision) and 64-bit (double-precision) IEEE754 floating-point operations.

The single-precision floating-point format consists of a 24-bit signed mantissa (s + f) and an 8-bit exponent (e), as shown in Figure 7-8.

Figure 7-8. Single-Precision Floating-Point Format

    Bits     Field          Width
    31       s (sign)       1
    30-23    e (exponent)   8
    22-0     f (mantissa)   23

The double-precision floating-point format consists of a 53-bit signed mantissa (s + f) and an 11-bit exponent (e), as shown in Figure 7-9.

Figure 7-9. Double-Precision Floating-Point Format

    Bits     Field          Width
    63       s (sign)       1
    62-52    e (exponent)   11
    51-0     f (mantissa)   52

A numeric value in the floating-point format consists of the following three areas.

- Sign bit: s
- Exponent: E = e + bias value
- Mantissa: f = .b1 b2 ... b(p-1) (the digits below the binary point)

The range of the unbiased exponent e covers all integer values from Emin to Emax, plus two reserved values: Emin - 1 (±0 or denormalized number) and Emax + 1 (±∞ or NaN: not a number). In both the single-precision and double-precision formats, each numeric value other than 0 has exactly one representation.

The numeric value (v) expressed in this format can be calculated by the expressions shown in Table 7-4.
Table 7-4. Calculation Expression of Floating-Point Value

    Type                   Calculation expression
    NaN (not a number)     If e = Emax + 1 and f ≠ 0, v is NaN regardless of s.
    ±∞ (infinite number)   If e = Emax + 1 and f = 0, v = (-1)^s ∞
    Normalized number      If Emin ≤ e ≤ Emax, v = (-1)^s 2^e (1.f)
    Denormalized number    If e = Emin - 1 and f ≠ 0, v = (-1)^s 2^Emin (0.f)
    ±0 (zero)              If e = Emin - 1 and f = 0, v = (-1)^s 0

- NaN (not a number)
  IEEE754 defines a floating-point value called NaN (not a number). Because it is not a numeric value, it has no greater-than or less-than relationship with any value.
  In all the floating-point formats, if v is NaN it is either a SignalingNaN or a QuietNaN, depending on the value of the most significant bit of f. If the most significant bit of f is set, v is a SignalingNaN; if the most significant bit is cleared, it is a QuietNaN.

Table 7-5 shows the value of each parameter defined in the floating-point format.

Table 7-5. Floating-Point Format and Parameter Value

    Parameter                             Single precision   Double precision
    Emax                                  +127               +1023
    Emin                                  -126               -1022
    Bias value of exponent                +127               +1023
    Length of exponent (number of bits)   8                  11
    Integer bit                           Cannot be seen     Cannot be seen
    Length of mantissa (number of bits)   24                 53
    Length of format (number of bits)     32                 64

Table 7-6 shows the minimum value and maximum value that can be expressed in this floating-point format.

Table 7-6. Maximum and Minimum Values of Floating Point

    Type                                                        Value
    Minimum value of single-precision floating point            1.40129846e-45
    Minimum value of single-precision floating point (normal)   1.17549435e-38
    Maximum value of single-precision floating point            3.40282347e+38
    Minimum value of double-precision floating point            4.9406564584124654e-324
    Minimum value of double-precision floating point (normal)   2.2250738585072014e-308
    Maximum value of double-precision floating point            1.7976931348623157e+308
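The expressions of Table 7-4 can be checked with a short decoder for the single-precision format. This Python sketch is illustrative only: the function name `decode_single` is an assumption for this example, the constants are the single-precision parameters of Table 7-5, and NaN payloads are not distinguished (SignalingNaN and QuietNaN both decode to a Python NaN).

```python
import math
import struct

# Single-precision parameters from Table 7-5.
BIAS, E_MAX, E_MIN, F_BITS = 127, 127, -126, 23


def decode_single(bits):
    """Evaluate a 32-bit single-precision pattern using the rules of Table 7-4."""
    s = (bits >> 31) & 1
    e = ((bits >> F_BITS) & 0xFF) - BIAS     # unbiased exponent
    f = bits & ((1 << F_BITS) - 1)           # fraction field
    sign = -1.0 if s else 1.0
    if e == E_MAX + 1:                       # reserved: infinity or NaN
        return math.nan if f else sign * math.inf
    if e == E_MIN - 1:                       # reserved: zero or denormalized, 0.f
        return sign * 2.0 ** E_MIN * (f / 2 ** F_BITS)
    return sign * 2.0 ** e * (1 + f / 2 ** F_BITS)   # normalized, 1.f


def host_single(bits):
    """Cross-check: let the host interpret the same pattern as an IEEE754 float."""
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]
```

Decoding the patterns 0x00000001, 0x00800000, and 0x7F7FFFFF reproduces the three single-precision rows of Table 7-6 (smallest denormalized value, smallest normalized value, and largest value).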
7.4.2 Fixed-point format

A fixed-point value is held in 2's complement format. The floating-point instruction set does not provide operation instructions that handle data in an unsigned fixed-point format. Figure 7-10 shows the 32-bit fixed-point format and Figure 7-11 shows the 64-bit fixed-point format.

Figure 7-10. 32-Bit Fixed-Point Format

    Bits     Field         Width
    31       s (sign)      1
    30-0     i (integer)   31

    s: Sign bit
    i: Integer value (2's complement)

Figure 7-11. 64-Bit Fixed-Point Format

    Bits     Field         Width
    63       s (sign)      1
    62-0     i (integer)   63

    s: Sign bit
    i: Integer value (2's complement)
7.5 Outline of FPU Instruction Set

All the FPU instructions are 32 bits long and aligned at a word boundary. These instructions are classified as follows.

- Load/store/transfer instructions that transfer data between a general-purpose register or control register of the FPU and the CPU or memory
- Conversion instructions that convert the data format
- Arithmetic operation instructions that execute an operation on a floating-point value in an FPU register
- Comparison instructions that compare FPU registers and set the result in the CC bits of FCR31 and FCR25
- FPU branch instructions that branch to a specified target if the specified coprocessor condition is satisfied

The fmt suffix appended to the opcode of an operation or comparison instruction indicates the data type: S indicates single-precision floating point, D indicates double-precision floating point, L indicates 64-bit fixed point, and W indicates 32-bit fixed point. For example, "ADD.D" indicates that the operands of the addition instruction are double-precision floating-point values. If the FR bit of the status register in CP0 is 0, an odd-numbered register cannot be specified.

For details of each instruction, refer to Chapter 18 FPU Instruction Set.
7.5.1 Floating-point load/store/transfer instructions

(1) Load/store between FPU and memory

Loading and storing between the FPU and memory is performed by the following instructions.

- LWC1, LWXC1, SWC1, and SWXC1 instructions, which access an FGR in word (32-bit) units
- LDC1, LDXC1, LUXC1, SDC1, SDXC1, and SUXC1 instructions, which access an FGR in doubleword (64-bit) units

These load/store instructions are independent of the numeric value format: no format conversion is executed, and no floating-point operation exception occurs.

(2) Data transfer between FPU and CPU

Data is transferred between a general-purpose register of the FPU and the CPU by the MTC1, MFC1, DMTC1, or DMFC1 instruction. Like the load/store instructions, these transfer instructions do not convert the numeric value format, and no floating-point operation exception occurs.

The CTC1 and CFC1 instructions transfer data between a control register of the FPU and the CPU.

(3) Load delay and hardware interlock

The register being loaded can be used by the instruction immediately after the load instruction. In this case, however, an interlock occurs and a cycle is added, so the load delay slot must be scheduled to avoid the interlock. With the VR5500, however, the load delay is hidden unless the pipeline is congested, because instructions are executed by an out-of-order mechanism. Instructions therefore appear to execute without delay.

(4) Aligning data

All the load/store instructions except LUXC1 and SUXC1 reference data aligned as follows.

- The access type for a word load/store instruction is always a word, and the lower 2 bits of the address must be 0.
- The access type for a doubleword load/store instruction is always a doubleword, and the lower 3 bits of the address must be 0.
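The alignment rules in (4) amount to a check on the low address bits. The Python sketch below is a hypothetical checker written for this illustration; the function name `is_aligned` is an assumption, and LUXC1/SUXC1 are simply exempted because the text excludes them from the rule (how the hardware treats their low address bits is not modeled here).

```python
# Hypothetical alignment checker for the FPU load/store instructions of 7.5.1.
WORD_OPS = {"LWC1", "LWXC1", "SWC1", "SWXC1"}
DOUBLEWORD_OPS = {"LDC1", "LDXC1", "SDC1", "SDXC1"}
UNALIGNED_OPS = {"LUXC1", "SUXC1"}   # excluded from the alignment rule


def is_aligned(op, address):
    if op in WORD_OPS:
        return (address & 0x3) == 0   # lower 2 bits must be 0
    if op in DOUBLEWORD_OPS:
        return (address & 0x7) == 0   # lower 3 bits must be 0
    if op in UNALIGNED_OPS:
        return True                   # no requirement stated; not modeled further
    raise ValueError("not an FPU load/store instruction: %s" % op)
```

For example, address 0x1004 satisfies the rule for LWC1 but not for LDC1.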
(5) Byte arrangement

Regardless of the byte arrangement (endianness), an address is specified by the lowest byte address in the address area. In a big-endian system, the leftmost byte address is specified. In a little-endian system, the rightmost byte address is specified.

Table 7-7 lists the load/store/transfer instructions.
Table 7-7. Load/Store/Transfer Instructions (1/2)

Load word to FPU: LWC1 ft, offset (base)
    Sign-extends and adds a 16-bit offset to the contents of CPU register base to generate an address. Loads the contents of the word specified by the address to FPU general-purpose register ft.

Store word from FPU: SWC1 ft, offset (base)
    Sign-extends and adds a 16-bit offset to the contents of CPU register base to generate an address. Stores the contents of FPU general-purpose register ft in the memory location specified by the address.

Load doubleword to FPU: LDC1 ft, offset (base)
    Sign-extends and adds a 16-bit offset to the contents of CPU register base to generate an address. When FR = 0, loads the contents of the doubleword specified by the address to FPU general-purpose registers ft and ft + 1. When FR = 1, loads the contents of the doubleword to FPU general-purpose register ft.

Store doubleword from FPU: SDC1 ft, offset (base)
    Sign-extends and adds a 16-bit offset to the contents of CPU register base to generate an address. When FR = 0, stores the contents of FPU general-purpose registers ft and ft + 1 in the memory location specified by the address. When FR = 1, stores the contents of FPU general-purpose register ft in the same memory location.

Load word indexed to FPU: LWXC1 fd, index (base)
    Adds the contents of CPU register base to the contents of CPU register index to generate an address. Loads the contents of the word specified by the address to FPU general-purpose register fd.

Load doubleword indexed to FPU: LDXC1 fd, index (base)
    Adds the contents of CPU register base to the contents of CPU register index to generate an address. Loads the contents of the doubleword specified by the address to FPU general-purpose registers fd and fd + 1 when FR = 0, and to FPU general-purpose register fd when FR = 1.

Load doubleword indexed unaligned to FPU: LUXC1 fd, index (base)
    Adds the contents of CPU register base to the contents of CPU register index to generate an address. Loads the contents of the doubleword specified by the address to FPU general-purpose registers fd and fd + 1 when FR = 0, and to FPU general-purpose register fd when FR = 1.

Store word indexed from FPU: SWXC1 fs, index (base)
    Adds the contents of CPU register base to the contents of CPU register index to generate an address. Stores the contents of FPU general-purpose register fs in the memory location specified by the address.

Store doubleword indexed from FPU: SDXC1 fs, index (base)
    Adds the contents of CPU register base to the contents of CPU register index to generate an address. Stores the contents of FPU general-purpose registers fs and fs + 1 in the memory location specified by the address when FR = 0, and the contents of FPU general-purpose register fs in the same memory location when FR = 1.

Store doubleword indexed unaligned from FPU: SUXC1 fs, index (base)
    Adds the contents of CPU register base to the contents of CPU register index to generate an address. Stores the contents of FPU general-purpose registers fs and fs + 1 in the memory location specified by the address when FR = 0, and the contents of FPU general-purpose register fs in the same memory location when FR = 1.

[Instruction format diagrams. Fields shown: op, base, ft, offset / COP1, base, index, 0, fd, function / COP1, base, index, fs, 0, function]
Table 7-7. Load/Store/Transfer Instructions (2/2)

Move word to FPU: MTC1 rt, fs
    Transfers the contents of CPU general-purpose register rt to FPU general-purpose register fs.

Move word from FPU: MFC1 rt, fs
    Transfers the contents of FPU general-purpose register fs to CPU general-purpose register rt.

Move control word to FPU: CTC1 rt, fs
    Transfers the contents of CPU general-purpose register rt to FPU control register fs.

Move control word from FPU: CFC1 rt, fs
    Transfers the contents of FPU control register fs to CPU general-purpose register rt.

Doubleword move to FPU: DMTC1 rt, fs
    Transfers the contents of CPU general-purpose register rt to FPU general-purpose register fs.

Doubleword move from FPU: DMFC1 rt, fs
    Transfers the contents of FPU general-purpose register fs to CPU general-purpose register rt.

Floating-point move conditional on FPU true: MOVT.fmt fd, fs, cc
    Transfers the contents of FPU register fs in the specified format (fmt) to FPU register fd if the CC bit is true.

Floating-point move conditional on FPU false: MOVF.fmt fd, fs, cc
    Transfers the contents of FPU register fs in the specified format (fmt) to FPU register fd if the CC bit is false.

Floating-point move conditional on zero: MOVZ.fmt fd, fs, rt
    Transfers the contents of FPU register fs in the specified format (fmt) to FPU register fd if CPU register rt is 0.

Floating-point move conditional on not zero: MOVN.fmt fd, fs, rt
    Transfers the contents of FPU register fs in the specified format (fmt) to FPU register fd if CPU register rt is other than 0.

[Instruction format diagrams. Fields shown: COP1, sub, rt, fs, 0 / COP1, fmt, cc, fs, fd, function / COP1, fmt, rt, fs, fd, function]
7.5.2 Conversion instructions

The conversion instructions execute format conversion between single precision and double precision, or between fixed point and floating point. Table 7-8 lists the conversion instructions.

Table 7-8. Conversion Instructions

Floating-point convert to single floating-point format: CVT.S.fmt fd, fs
    Converts the contents of FPU register fs from the specified format (fmt) into the single-precision floating-point format. Stores the result, rounded in accordance with the setting of FCR31 and FCR28, in FPU register fd.

Floating-point convert to double floating-point format: CVT.D.fmt fd, fs
    Converts the contents of FPU register fs from the specified format (fmt) into the double-precision floating-point format. Stores the result, rounded in accordance with the setting of FCR31 and FCR28, in FPU register fd.

Floating-point convert to long fixed-point format: CVT.L.fmt fd, fs
    Converts the contents of FPU register fs from the specified format (fmt) into the 64-bit fixed-point format. Stores the result, rounded in accordance with the setting of FCR31 and FCR28, in FPU register fd.

Floating-point convert to single fixed-point format: CVT.W.fmt fd, fs
    Converts the contents of FPU register fs from the specified format (fmt) into the 32-bit fixed-point format. Stores the result, rounded in accordance with the setting of FCR31 and FCR28, in FPU register fd.

Floating-point round to long fixed-point format: ROUND.L.fmt fd, fs
    Rounds the contents of FPU register fs to the closest value and converts them from the specified format (fmt) into the 64-bit fixed-point format. Stores the result in FPU register fd.

Floating-point round to single fixed-point format: ROUND.W.fmt fd, fs
    Rounds the contents of FPU register fs to the closest value and converts them from the specified format (fmt) into the 32-bit fixed-point format. Stores the result in FPU register fd.

Floating-point truncate to long fixed-point format: TRUNC.L.fmt fd, fs
    Rounds the contents of FPU register fs toward 0 and converts them from the specified format (fmt) into the 64-bit fixed-point format. Stores the result in FPU register fd.

Floating-point truncate to single fixed-point format: TRUNC.W.fmt fd, fs
    Rounds the contents of FPU register fs toward 0 and converts them from the specified format (fmt) into the 32-bit fixed-point format. Stores the result in FPU register fd.

Floating-point ceiling to long fixed-point format: CEIL.L.fmt fd, fs
    Rounds the contents of FPU register fs toward +∞ and converts them from the specified format (fmt) into the 64-bit fixed-point format. Stores the result in FPU register fd.

Floating-point ceiling to single fixed-point format: CEIL.W.fmt fd, fs
    Rounds the contents of FPU register fs toward +∞ and converts them from the specified format (fmt) into the 32-bit fixed-point format. Stores the result in FPU register fd.

Floating-point floor to long fixed-point format: FLOOR.L.fmt fd, fs
    Rounds the contents of FPU register fs toward -∞ and converts them from the specified format (fmt) into the 64-bit fixed-point format. Stores the result in FPU register fd.

Floating-point floor to single fixed-point format: FLOOR.W.fmt fd, fs
    Rounds the contents of FPU register fs toward -∞ and converts them from the specified format (fmt) into the 32-bit fixed-point format. Stores the result in FPU register fd.

[Instruction format diagram. Fields shown: COP1, fmt, 0, fs, fd, function]
When converting a floating-point format into a fixed-point format, make sure that the result is a value in the range of -2^53 to 2^53 - 1. If rounding the source value produces a result outside the range of -2^53 to 2^53 - 1, the result cannot be expressed correctly; an unimplemented operation exception occurs and the result of the operation is discarded. The instructions that cause the unimplemented operation exception under these conditions are listed below.

    CEIL.L.S     CEIL.L.D
    CVT.L.S      CVT.L.D
    FLOOR.L.S    FLOOR.L.D
    ROUND.L.S    ROUND.L.D
    TRUNC.L.S    TRUNC.L.D

An unimplemented operation exception may also occur when converting a fixed-point format into a floating-point format. For details, refer to 8.3.6 Unimplemented operation exception (E).
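The range restriction on conversions to the 64-bit fixed-point format can be modeled as follows. This Python sketch is a simplified illustration, not a description of the hardware: the function name `convert_to_long` and its return convention are assumptions for this example, the rounding modes are mapped onto Python's built-in rounding functions, and NaN and infinity inputs are deliberately left out.

```python
import math

LONG_MAX = 2 ** 53 - 1    # upper limit stated in 7.5.2
LONG_MIN = -(2 ** 53)     # lower limit stated in 7.5.2

# Rounding modes of Table 7-3 mapped to Python functions.
# Python's round() rounds halfway cases to even, which matches RN.
ROUNDERS = {"RN": round, "RZ": math.trunc, "RP": math.ceil, "RM": math.floor}


def convert_to_long(x, mode="RN"):
    """Return ("ok", integer) or ("unimplemented", None) for an out-of-range result."""
    if math.isnan(x) or math.isinf(x):
        raise ValueError("NaN and infinity are not modeled in this sketch")
    result = ROUNDERS[mode](x)
    if LONG_MIN <= result <= LONG_MAX:
        return ("ok", result)
    return ("unimplemented", None)
```

In this model, `convert_to_long(float(2 ** 53))` is reported as unimplemented because the result exceeds 2^53 - 1, while `convert_to_long(-float(2 ** 53))` is still within range.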
7.5.3 Operation instructions

The operation instructions execute an operation on a floating-point value in a register. Table 7-9 lists the operation instructions. Three-operand instructions execute addition, subtraction, multiplication, or division of floating-point values. Two-operand instructions execute absolute value, transfer, square root, and arithmetic negation of a floating-point value.

Table 7-9. Operation Instructions (1/2)

Floating-point add: ADD.fmt fd, fs, ft
    Arithmetically adds the contents of FPU registers fs and ft in the specified format (fmt), and stores the rounded result in FPU register fd.

Floating-point subtract: SUB.fmt fd, fs, ft
    Arithmetically subtracts the contents of FPU register ft from the contents of FPU register fs in the specified format (fmt), and stores the rounded result in FPU register fd.

Floating-point multiply: MUL.fmt fd, fs, ft
    Arithmetically multiplies the contents of FPU registers fs and ft in the specified format (fmt), and stores the rounded result in FPU register fd.

Floating-point divide: DIV.fmt fd, fs, ft
    Arithmetically divides the contents of FPU register fs by the contents of FPU register ft in the specified format (fmt), and stores the rounded result in FPU register fd.

Floating-point absolute value: ABS.fmt fd, fs
    Calculates the arithmetic absolute value of the contents of FPU register fs in the specified format (fmt), and stores the result in FPU register fd.

Floating-point move: MOV.fmt fd, fs
    Copies the contents of FPU register fs in the specified format (fmt) to FPU register fd.

Floating-point negate: NEG.fmt fd, fs
    Calculates the arithmetic negation of the contents of FPU register fs in the specified format (fmt), and stores the result in FPU register fd.

Floating-point square root: SQRT.fmt fd, fs
    Calculates the arithmetic positive square root of the contents of FPU register fs in the specified format (fmt), and stores the rounded result in FPU register fd.

[Instruction format diagram. Fields shown: COP1, fmt, ft, fs, fd, function]
Table 7-9. Operation Instructions (2/2)

Floating-point multiply-add: MADD.fmt fd, fr, fs, ft
    Multiplies the contents of FPU registers fs and ft in the specified format (fmt), and adds the result to the contents of FPU register fr in the specified format (fmt). Then stores the rounded result in FPU register fd.

Floating-point multiply-subtract: MSUB.fmt fd, fr, fs, ft
    Multiplies the contents of FPU registers fs and ft in the specified format (fmt), and subtracts the contents of FPU register fr from the result in the specified format (fmt). Then stores the rounded result in FPU register fd.

Floating-point negate multiply-add: NMADD.fmt fd, fr, fs, ft
    Multiplies the contents of FPU registers fs and ft in the specified format (fmt), and adds the result to the contents of FPU register fr in the specified format (fmt). Rounds the result and calculates its arithmetic negation, and then stores that result in FPU register fd.

Floating-point negate multiply-subtract: NMSUB.fmt fd, fr, fs, ft
    Multiplies the contents of FPU registers fs and ft in the specified format (fmt), and subtracts the contents of FPU register fr from the result in the specified format (fmt). Rounds the result and calculates its arithmetic negation, and then stores that result in FPU register fd.

Floating-point reciprocal: RECIP.fmt fd, fs
    Calculates the approximate value of the reciprocal of the contents of FPU register fs in the specified format, and stores the result in FPU register fd.

Floating-point reciprocal square root: RSQRT.fmt fd, fs
    Calculates the square root of the contents of FPU register fs and then the approximate value of the reciprocal of that value in the specified format. Then stores the result in FPU register fd.

[Instruction format diagrams. Fields shown: COP1X, fr, ft, fs, fd, function, fmt / COP1, fmt, 0, fs, fd, function]
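The operand roles of the four multiply-add instructions in Table 7-9 are easy to confuse, so the following Python sketch restates them as plain arithmetic. It is an assumption-laden model: the function names are chosen for this example, Python floats stand in for the double-precision format, and the multiply and add are rounded separately here, which may or may not match the intermediate rounding of the hardware (the table only states that the final result is rounded).

```python
# Hypothetical arithmetic models of the multiply-add family (operands: fr, fs, ft).

def madd(fr, fs, ft):
    return fs * ft + fr        # product plus fr


def msub(fr, fs, ft):
    return fs * ft - fr        # product minus fr


def nmadd(fr, fs, ft):
    return -(fs * ft + fr)     # negated multiply-add


def nmsub(fr, fs, ft):
    return -(fs * ft - fr)     # negated multiply-subtract
```

Note that in MSUB and NMSUB it is fr that is subtracted from the product, not the other way around.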
7.5.4 Comparison instruction

The comparison instruction (C.cond.fmt) interprets the contents of two FPU registers (fs and ft) in the specified format (fmt) and compares them. The result is determined by the comparison condition (cond) included in the instruction code. Table 7-10 lists the comparison instruction, and Table 7-11 lists the conditions of the comparison instruction.

Table 7-10. Comparison Instruction

Floating-point compare: C.cond.fmt fs, ft
    Interprets the contents of FPU registers fs and ft in the specified format (fmt), and arithmetically compares them. The result is determined by the comparison and the specified condition (cond). The result of the comparison can be used by the FPU branch instructions of the CPU.

[Instruction format diagram. Fields shown: COP1, fmt, ft, fs, 0, function]

Table 7-11. Conditions for Comparison Instruction

    Mnemonic  Definition                                   Mnemonic  Definition
    F         Always false                                 T         Always true
    UN        Unordered                                    OR        Ordered
    EQ        Equal                                        NEQ       Not equal
    UEQ       Unordered or equal                           OLG       Ordered and less than or greater than
    OLT       Ordered and less than                        UGE       Unordered or greater than or equal to
    ULT       Unordered or less than                       OGE       Ordered and greater than or equal to
    OLE       Ordered and less than or equal to            UGT       Unordered or greater than
    ULE       Unordered or less than or equal to           OGT       Ordered and greater than
    SF        Signaling and false                          ST        Signaling and true
    NGLE      Not greater than, not less than,             GLE       Greater than, less than, or equal to
              and not equal to
    SEQ       Signaling and equal to                       SNE       Signaling and not equal to
    NGL       Not greater than and not less than           GL        Greater than or less than
    LT        Less than                                    NLT       Not less than
    NGE       Not greater than and not equal to            GE        Greater than or equal to
    LE        Less than or equal to                        NLE       Not less than and not equal to
    NGT       Not greater than                             GT        Greater than
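The non-signaling conditions in the first eight rows of Table 7-11 can be expressed in terms of the four possible relations between two operands (unordered, less, equal, greater). The Python sketch below is an interpretation written for this illustration: the names `relation` and `compare` are assumptions, the right-hand mnemonic of each pair is treated simply as the logical negation of the left-hand one, and the signaling variants (SF through NGT) are not modeled.

```python
import math

# Relations for which each left-hand condition of Table 7-11 is true.
TRUE_WHEN = {
    "F": set(),
    "UN": {"unordered"},
    "EQ": {"equal"},
    "UEQ": {"unordered", "equal"},
    "OLT": {"less"},
    "ULT": {"unordered", "less"},
    "OLE": {"less", "equal"},
    "ULE": {"unordered", "less", "equal"},
}
# Right-hand mnemonic -> the left-hand condition it negates.
NEGATION = {"T": "F", "OR": "UN", "NEQ": "EQ", "OLG": "UEQ",
            "UGE": "OLT", "OGE": "ULT", "UGT": "OLE", "OGT": "ULE"}


def relation(fs, ft):
    """Classify two operands; any NaN operand makes the pair unordered."""
    if math.isnan(fs) or math.isnan(ft):
        return "unordered"
    if fs < ft:
        return "less"
    return "equal" if fs == ft else "greater"


def compare(cond, fs, ft):
    rel = relation(fs, ft)
    if cond in TRUE_WHEN:
        return rel in TRUE_WHEN[cond]
    return rel not in TRUE_WHEN[NEGATION[cond]]
```

For instance, with a NaN operand `compare("ULT", ...)` is true while `compare("OLT", ...)` is false, which is the only case in which the two conditions differ.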
7.5.5 FPU branch instructions

Table 7-12 lists the FPU branch instructions. These instructions can be used to test the result of the comparison instruction (C.cond.fmt).

"Delay slot" in this table means the instruction immediately following a branch instruction. For details, refer to Chapter 4 Pipeline.

Table 7-12. FPU Branch Instructions

Branch on FPU true: BC1T offset
    Calculates the branch target address by adding the address of the instruction in the delay slot and the 16-bit offset (shifted 2 bits to the left and sign-extended). If the FPU condition line is true, execution branches to the target address (delay of 1 instruction).

Branch on FPU false: BC1F offset
    Calculates the branch target address by adding the address of the instruction in the delay slot and the 16-bit offset (shifted 2 bits to the left and sign-extended). If the FPU condition line is false, execution branches to the target address (delay of 1 instruction).

Branch on FPU true likely: BC1TL offset
    Calculates the branch target address by adding the address of the instruction in the delay slot and the 16-bit offset (shifted 2 bits to the left and sign-extended). If the FPU condition line is true, execution branches to the target address (delay of 1 instruction). If the conditional branch is not taken, the instruction in the delay slot is invalidated.

Branch on FPU false likely: BC1FL offset
    Calculates the branch target address by adding the address of the instruction in the delay slot and the 16-bit offset (shifted 2 bits to the left and sign-extended). If the FPU condition line is false, execution branches to the target address (delay of 1 instruction). If the conditional branch is not taken, the instruction in the delay slot is invalidated.

[Instruction format diagram. Fields shown: COP1, BC, br, offset]

7.5.6 Other instructions

Table 7-13. Prefetch Instruction

Prefetch indexed: PREFX hint, index (base)
    Adds the contents of CPU register base to the contents of CPU register index to generate an address. How the data specified by the address is treated is specified by the hint field.

[Instruction format diagram. Fields shown: COP1, base, index, hint, 0, function]

Table 7-14. Conditional Transfer Instructions

Move conditional on FPU true: MOVT rd, rs, cc
    Transfers the contents of CPU register rs to CPU register rd if the CC bit is true.

Move conditional on FPU false: MOVF rd, rs, cc
    Transfers the contents of CPU register rs to CPU register rd if the CC bit is false.

[Instruction format diagram. Fields shown: SPECIAL, rs, cc, rd, 0, funct]
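The branch target calculation described in Table 7-12 can be written out as a small calculation. This Python sketch is illustrative only: the function name `branch_target` and the `addr_bits` parameter are assumptions for this example, and the delay slot address is taken as the branch address plus 4 because all FPU instructions are 32 bits long.

```python
# Hypothetical model of the BC1T/BC1F/BC1TL/BC1FL target address calculation.

def branch_target(branch_pc, offset16, addr_bits=64):
    delay_slot_pc = branch_pc + 4            # instruction following the branch
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                      # sign-extend the 16-bit field
        offset -= 0x10000
    # Shift the offset 2 bits left (word offset to byte offset) and add it.
    return (delay_slot_pc + (offset << 2)) & ((1 << addr_bits) - 1)
```

With a branch at 0x1000, an offset field of 0x0003 gives the target 0x1010, and an offset field of 0xFFFF (minus one word) gives the address of the branch itself.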
7.6 Execution Time of FPU Instruction

Unlike the CPU, which executes almost all instructions in 1 cycle, the FPU takes longer to execute its instructions. Table 7-15 shows the minimum execution time of each floating-point instruction in number of PCycles. These execution times assume that the result of each instruction is used by the instruction immediately after it.

Table 7-15. Number of Execution Cycles of Floating-Point Instructions (1/2)

    Number of PCycles (when executed singly/repeatedly)

    Instruction    Single    Double     Word    Long word
    ADD.fmt        4/4       4/4        -       -
    SUB.fmt        4/4       4/4        -       -
    MUL.fmt        5/5       6/6        -       -
    MADD.fmt       9/9       10/10      -       -
    MSUB.fmt       9/9       10/10      -       -
    NMADD.fmt      9/9       10/10      -       -
    NMSUB.fmt      9/9       10/10      -       -
    DIV.fmt        30/30     59/59      -       -
    SQRT.fmt       30/30     59/59      -       -
    RECIP.fmt      30/30     59/59      -       -
    RSQRT.fmt      60/60     118/118    -       -
    ABS.fmt        2/2       2/2        -       -
    NEG.fmt        2/2       2/2        -       -
    ROUND.W.fmt    6/6       6/6        -       -
    ROUND.L.fmt    6/6       6/6        -       -
    TRUNC.W.fmt    6/6       6/6        -       -
    TRUNC.L.fmt    6/6       6/6        -       -
    CEIL.W.fmt     6/6       6/6        -       -
    CEIL.L.fmt     6/6       6/6        -       -
    FLOOR.W.fmt    6/6       6/6        -       -
    FLOOR.L.fmt    6/6       6/6        -       -
    CVT.D.fmt      2/2       -          6/6     6/6
    CVT.S.fmt      -         4/4        6/6     6/6
    CVT.W.fmt      6/6       6/6        -       -
    CVT.L.fmt      6/6       6/6        -       -
    C.cond.fmt     2/2       2/2        -       -
table 7-15. number of execution cycles of floating-point instructions (2/2)

number of pcycles (when executed singly/repeatedly)

instruction    single                    double                    word    long word
bc1t           2/2 (hit), 6/6 (miss)     2/2 (hit), 6/6 (miss)     -       -
bc1f           2/2 (hit), 6/6 (miss)     2/2 (hit), 6/6 (miss)     -       -
bc1tl          2/2 (hit), 6/6 (miss)     2/2 (hit), 6/6 (miss)     -       -
bc1fl          2/2 (hit), 6/6 (miss)     2/2 (hit), 6/6 (miss)     -       -
lwc1           4/3                       4/3                       -       -
swc1           na/1                      na/1                      -       -
ldc1           4/3                       4/3                       -       -
sdc1           na/1                      na/1                      -       -
lwxc1          4/3                       4/3                       -       -
swxc1          na/1                      na/1                      -       -
ldxc1          4/3                       4/3                       -       -
sdxc1          na/1                      na/1                      -       -
luxc1          4/3                       4/3                       -       -
suxc1          na/1                      na/1                      -       -
mov.fmt        2/2                       2/2                       -       -
movz.fmt       7/7                       7/7                       -       -
movn.fmt       7/7                       7/7                       -       -
movf.fmt       7/7                       7/7                       -       -
movt.fmt       7/7                       7/7                       -       -
mtc1           2/2                       2/2                       -       -
mfc1           1/1                       1/1                       -       -
dmtc1          2/2                       2/2                       -       -
dmfc1          1/1                       1/1                       -       -
ctc1 (note)    10/12                     10/12                     -       -
cfc1 (note)    10/12                     10/12                     -       -

note: this instruction is executed serially. no other instructions are executed at the same time.

remark: na: under evaluation
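because the times in table 7-15 assume that each result is consumed by the next instruction, a rough lower bound for a fully dependent chain of arithmetic instructions can be estimated by summing the "executed singly" values. the following python sketch does this for a subset of the table. the dictionary layout and function name are assumptions for illustration, and the estimate ignores stalls, exceptions, and memory accesses.

```python
# minimum pcycles (executed singly) copied from table 7-15 for a few instructions
# key: (instruction, format) with "s" = single and "d" = double
MIN_PCYCLES = {
    ("add", "s"): 4, ("add", "d"): 4,
    ("sub", "s"): 4, ("sub", "d"): 4,
    ("mul", "s"): 5, ("mul", "d"): 6,
    ("madd", "s"): 9, ("madd", "d"): 10,
    ("div", "s"): 30, ("div", "d"): 59,
    ("sqrt", "s"): 30, ("sqrt", "d"): 59,
    ("recip", "s"): 30, ("recip", "d"): 59,
    ("rsqrt", "s"): 60, ("rsqrt", "d"): 118,
    ("abs", "s"): 2, ("abs", "d"): 2,
    ("neg", "s"): 2, ("neg", "d"): 2,
}


def dependent_chain_pcycles(chain):
    """Sum the minimum pcycles of a chain in which each result feeds the next instruction."""
    return sum(MIN_PCYCLES[op] for op in chain)
```

for example, a double-precision multiply followed by a dependent add and a dependent divide needs at least 6 + 4 + 59 = 69 pcycles under these assumptions.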
chapter 8 floating-point exceptions

this chapter explains how the fpu processes floating-point exceptions.

8.1 types of exceptions

a floating-point exception occurs if a floating-point operation or an operation result cannot be processed by the ordinary method. the fpu performs one of the following operations if an exception occurs.

• when exceptions are enabled
  the fpu sets the cause bit of the control/status register (fcr31) or cause/flag register (fcr26) and transfers processing to an exception handler routine (software processing).

• when exceptions are disabled
  the fpu stores an appropriate value (default value) in the destination register and continues execution.

the fpu supports the following five types of ieee754 exceptions by using the cause bit, enable bit, and flag bit (status flag).

• inexact operation (i)
• overflow (o)
• underflow (u)
• division-by-zero (z)
• invalid operation (v)

as the sixth exception cause, the fpu has an unimplemented operation (e) that is used if a floating-point operation cannot be executed with the standard architecture of mips (including when the fpu cannot correctly process an exception). this exception must be processed by software. an e bit is not provided in the enable or flag bits. if this exception occurs, unimplemented exception processing is executed (if interrupts input by the fpu to the cpu are enabled).

figure 8-1 shows the bits of fcr31 that are used to support exceptions. the same enable bits are also provided in fcr28, and the same cause and flag bits are also provided in fcr26.
figure 8-1. cause/enable/flag bits of fcr31

cause bits:  e (bit 17), v (bit 16), z (bit 15), o (bit 14), u (bit 13), i (bit 12)
enable bits: v (bit 11), z (bit 10), o (bit 9), u (bit 8), i (bit 7)
flag bits:   v (bit 6), z (bit 5), o (bit 4), u (bit 3), i (bit 2)

e: unimplemented operation, v: invalid operation, z: division by zero, o: overflow, u: underflow, i: inexact operation

the five exceptions of ieee754 (v, z, o, u, and i) are enabled by setting the corresponding enable bit. when an exception occurs, the corresponding cause bit is set. if the corresponding enable bit is set, the fpu generates an interrupt to the cpu, and starts exception processing. if occurrence of the exception is disabled, the cause bit and flag bit corresponding to that exception are set.

8.2 exception processing

if a floating-point operation exception occurs, the cause register of cp0 indicates that the cause of the exception lies in the fpu. the code of the floating-point exception (fpe) is used, and the cause bits of fcr31 and fcr26 indicate the cause of the floating-point operation exception. these bits function as an extension of the cause register of cp0.

8.2.1 flag

a flag bit is available for each ieee754 exception. the flag bit is set if occurrence of the corresponding exception is disabled and the condition of the exception is detected. the flag bit can be set or reset by writing a new value to fcr31 or fcr26 using the ctc1 instruction.

if an exception is disabled by the corresponding enable bit, the fpu performs predetermined processing. this processing gives a default value instead of the result of the floating-point operation. this default value is determined by the type of the exception. if an overflow or underflow exception occurs, the default value differs depending on the rounding mode at that time. table 8-1 shows the default values given by each ieee754 exception of the fpu.
table 8-1. default values of ieee754 exceptions in fpu

field  description         rounding mode  default value
v      invalid operation   -              uses quiet not a number (q-nan).
z      division-by-zero    -              uses correctly signed ∞.
o      overflow            rn             ∞ with sign of intermediate result
                           rz             maximum normalized number with sign of intermediate result
                           rp             negative overflow: maximum negative normalized number
                                          positive overflow: +∞
                           rm             positive overflow: maximum positive normalized number
                                          negative overflow: -∞
u      underflow           rn             0 with sign of intermediate result
                           rz             0 with sign of intermediate result
                           rp             positive underflow: minimum positive normalized number
                                          negative underflow: 0
                           rm             negative underflow: minimum negative normalized number
                                          positive underflow: 0
i      inexact operation   -              uses rounded result.

the fpu internally detects nine types of statuses that may trigger an exception. when the fpu detects one of these abnormal statuses, an ieee754 exception or the unimplemented operation exception (e) occurs. table 8-2 shows the statuses that trigger exceptions, and compares the contents of the corresponding cause bits of the fpu with the ieee754 standard.

table 8-2. fpu internal result and flag status

fpu internal result            ieee754       exception enabled  exception disabled  remark
inexact operation              i             i                  i                   result is not accurate.
exponent overflow              o, i (note)   o, i               o, i                normalized exponent > emax
division-by-zero               z             z                  z                   zero (exponent = emin - 1, mantissa = 0)
overflow during conversion     v             e                  e                   source is outside integer range
signaling nan (s-nan) source   v             v                  v
invalid operation              v             v                  v                   0 ÷ 0, etc.
exponent underflow             u             e                  e                   normalized exponent < emin
denormalized source            none          e                  e                   exponent = emin - 1 and mantissa ≠ 0
q-nan                          none          e                  e

note: ieee754 allows an inexact operation exception to occur in the case of an overflow only when the overflow exception is disabled, but the vr5500 always allows an overflow exception and an inexact operation exception to occur in the case of an overflow.
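the cause, enable, and flag bits of fcr31 shown in figure 8-1 can be decoded with a few shifts and masks. the python sketch below is illustrative only. the function names are hypothetical, and the rule in trapping_causes (a cause traps if its enable bit is set, and e always traps because it has no enable bit) is a simplified reading of the description in 8.1.

```python
# bit positions taken from figure 8-1
CAUSE_BITS = {"e": 17, "v": 16, "z": 15, "o": 14, "u": 13, "i": 12}
ENABLE_BITS = {"v": 11, "z": 10, "o": 9, "u": 8, "i": 7}
FLAG_BITS = {"v": 6, "z": 5, "o": 4, "u": 3, "i": 2}


def _set_bits(value, positions):
    """Return the names whose bit is 1 in value."""
    return {name for name, bit in positions.items() if (value >> bit) & 1}


def decode_fcr31(value):
    """Split an fcr31 value into its cause, enable, and flag bit sets."""
    return {
        "cause": _set_bits(value, CAUSE_BITS),
        "enable": _set_bits(value, ENABLE_BITS),
        "flag": _set_bits(value, FLAG_BITS),
    }


def trapping_causes(value):
    """Causes that transfer control to the exception handler (simplified model)."""
    fields = decode_fcr31(value)
    enabled = fields["cause"] & fields["enable"]
    unimplemented = fields["cause"] & {"e"}  # e has no enable bit
    return enabled | unimplemented
```

for example, a value with bits 15 and 10 set decodes to a division-by-zero cause with its enable bit set, so trapping_causes returns {"z"}.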
8.3 details of exceptions

this section explains the conditions under which each exception occurs and the action taken by the fpu.

8.3.1 inexact operation exception (i)

the fpu generates an inexact operation exception in the following cases.

• if the accuracy of the rounded result drops
• if the rounded result overflows
• if the rounded result underflows, the underflow exception and inexact operation exception are both disabled, and the fs bit of fcr31 and fcr28 is set

usually, the fpu checks the operands of an instruction before executing the instruction. based on the exponent values of the operands, the fpu judges whether an exception may occur as a result of executing this instruction. if an exception may occur, the fpu uses a stall when executing this instruction.

however, the fpu cannot predict whether executing a certain instruction produces an inexact result. if the inexact operation exception is enabled, the fpu therefore uses a stall for executing all instructions, and the execution time increases by 1 cycle. this substantially affects the performance. enable the inexact operation exception only when it is necessary.

(1) if exception is enabled
the contents of the destination register are not changed, the contents of the source registers are preserved, and the inexact operation exception occurs.

(2) if exception is not enabled
if no other exception occurs, the rounded result or the result that underflows/overflows is stored in the destination register.
8.3.2 invalid operation exception (v)

an invalid operation exception occurs if one or both of the operands are invalid. if the exception is not enabled, the result is a quiet not a number (q-nan). the invalid operations include the following.

• addition/subtraction: addition/subtraction between infinities, such as (+∞) + (-∞) or (-∞) - (-∞)
• multiplication: 0 × ∞
• division: 0 ÷ 0 or ∞ ÷ ∞
• comparison using "<" or ">" and without "?" when an operand is unordered
• arithmetic operation with s-nan included in the operand. the transfer instruction (mov) is not treated as an arithmetic operation, but the absolute value (abs) and arithmetic negation (neg) are treated as arithmetic operations.
• comparison with s-nan as operand and conversion into floating point
• square root: if operand is less than 0

in addition to the above, an exception can be simulated by software if an invalid operation is performed on the specified source operand. examples of this operation include ieee754-specified functions that can be executed by software, such as the following.

• remainder x rem y, if y is 0 or if x is infinity
• conversion of a floating-point number into a decimal number, if the value is infinity or nan or triggers an overflow
• transcendental functions such as ln(-5) and cos^-1(3)

(1) if exception is enabled
the contents of the destination register are not changed, the contents of the source registers are preserved, and the invalid operation exception occurs.

(2) if exception is not enabled
if no other exception occurs, q-nan is stored in the destination register.

8.3.3 division-by-zero exception (z)

a division-by-zero exception occurs if the divisor is 0 and the dividend is a finite number other than 0. this exception also occurs if an operation that produces signed infinity as the result, such as ln(0), sec(π/2), csc(0), and 0^-1, is performed.
(1) if exception is enabled
the contents of the destination register are not changed, the contents of the source registers are preserved, and the division-by-zero exception occurs.

(2) if exception is not enabled
if no other exception occurs, a correctly signed infinite number (±∞) is stored in the destination register.
8.3.4 overflow exception (o)

an overflow exception occurs if the magnitude of the rounded floating-point result, calculated as if the exponent range were unbounded, is greater than the maximum finite number in the destination format (an inexact operation exception also occurs and the flag bit is set).

(1) if exception is enabled
the contents of the destination register are not changed, the contents of the source registers are preserved, and the overflow exception occurs.

(2) if exception is not enabled
if no other exception occurs, the default value that is determined by the rounding mode and the sign of the intermediate result is stored in the destination register (refer to table 8-1 default values of ieee754 exceptions in fpu).

8.3.5 underflow exception (u)

an underflow exception occurs in the following two cases.

• if the operation result is between -2^emin and +2^emin (but other than 0)
• if the accuracy drops as a result of an operation between small numbers that are not normalized

ieee754 permits several methods for detecting an underflow. however, be sure to detect an underflow by the same method whatever processing is performed. the following two methods may be used to detect an underflow.

• if the result calculated after rounding and with an infinite exponent range is other than 0 and within ±2^emin
• if the result calculated before rounding and with an infinite exponent range and accuracy is other than 0 and within ±2^emin

the mips architecture detects an underflow after rounding the result.

the following two methods may be used to detect a drop in accuracy.

• denormalization loss (if a given result and the result calculated when the exponent range is infinite differ)
• inexact result (if a given result and the result calculated when the exponent range and accuracy are infinite differ)

the mips architecture detects a drop in accuracy as an inexact result.
(1) if exception is enabled
if the underflow exception or inexact operation exception is enabled, or if the fs bit of fcr31 and fcr28 is not set, an unimplemented operation exception (e) occurs. at this time, the contents of the destination register are not changed.

(2) if exception is not enabled
if the underflow exception and inexact operation exception are disabled and the fs bit of fcr31 and fcr28 is set, the default value determined by the rounding mode and the sign of the intermediate result is stored in the destination register (refer to table 8-1 default values of ieee754 exceptions in fpu).
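the overflow and underflow defaults that table 8-1 assigns to each rounding mode can be written as two small lookup functions. the python sketch below models the double format only, using the host's ieee754 double as a stand-in. the function names and the string codes for the rounding modes are assumptions. where table 8-1 gives a plain 0 for rp and rm, the sketch keeps the sign of the intermediate result, which is also an assumption.

```python
import math
import sys

MAX_NORMAL = sys.float_info.max  # maximum normalized double
MIN_NORMAL = sys.float_info.min  # minimum positive normalized double


def overflow_default(mode, negative):
    """Default result of a disabled overflow exception (modes: rn, rz, rp, rm)."""
    if mode == "rn":
        return -math.inf if negative else math.inf
    if mode == "rz":
        return -MAX_NORMAL if negative else MAX_NORMAL
    if mode == "rp":
        return -MAX_NORMAL if negative else math.inf
    if mode == "rm":
        return -math.inf if negative else MAX_NORMAL
    raise ValueError("unknown rounding mode: " + mode)


def underflow_default(mode, negative):
    """Default result of a disabled underflow exception (modes: rn, rz, rp, rm)."""
    signed_zero = -0.0 if negative else 0.0  # assumption for the rp/rm zero cases
    if mode in ("rn", "rz"):
        return signed_zero
    if mode == "rp":
        return signed_zero if negative else MIN_NORMAL
    if mode == "rm":
        return -MIN_NORMAL if negative else signed_zero
    raise ValueError("unknown rounding mode: " + mode)
```

for example, a positive overflow in rp mode yields +∞, while the same overflow in rm mode yields the maximum positive normalized number.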
8.3.6 unimplemented operation exception (e)

the e bit is set and an exception occurs if an attempt is made to execute an instruction with an operation code reserved for future expansion or an invalid format code. the operand and the contents of the destination register are not changed. usually, the instruction is emulated by software. if an ieee754 exception occurs from an emulated operation, simulate that exception.

the unimplemented operation exception also occurs in the following cases, in which an abnormal operand or abnormal result that cannot be correctly processed by hardware is detected.

• if the operand is a denormalized number (except a compare instruction)
• if the operand is a q-nan (except a compare instruction)
• if the result is a denormalized number or underflows when the underflow/inexact operation exception is enabled or when the fs bit of fcr31 and fcr28 is not set
• if a reserved instruction is executed
• if an unimplemented format is used
• if a format whose operation is invalid is used (e.g., cvt.s.s)

caution: the exception occurs on a denormalized number or nan operand only if the instruction is a format conversion or arithmetic operation instruction. the exception does not occur even if the operand is a denormalized number or nan when a transfer instruction is executed.

the vr5500 also generates the unimplemented operation exception in the following cases.

• if the result of multiplication by the madd, msub, nmadd, or nmsub instruction is a denormalized number, underflows, or overflows
• if a mips iv floating-point instruction is executed when the mips iv instruction set is not enabled
• if the value of the result is outside the range of 2^53 - 1 (0x001f ffff ffff ffff) to -2^53 (0xffe0 0000 0000 0000) when the format is converted from a floating-point format to a 64-bit fixed-point format
  instruction: ceil.l.fmt, cvt.l.fmt, floor.l.fmt, round.l.fmt, trunc.l.fmt
• if the value of the result is outside the range of 2^31 - 1 (0x7fff ffff) to -2^31 (0x8000 0000) when the format is converted from a floating-point format to a 32-bit fixed-point format
  instruction: ceil.w.fmt, cvt.w.fmt, floor.w.fmt, round.w.fmt, trunc.w.fmt
• if the value of the source operand is outside the range of 2^55 - 1 (0x007f ffff ffff ffff) to -2^55 (0xff80 0000 0000 0000) when the format is converted from a 64-bit fixed-point format to a floating-point format
  instruction: cvt.d.fmt, cvt.s.fmt

the unimplemented operation exception can be used in any way by the system. to maintain complete compatibility with ieee754, the unimplemented operation exception can be handled by software if it occurs.

(1) if exception is enabled
the contents of the destination register are not changed, the contents of the source registers are preserved, and the unimplemented operation exception occurs.

(2) if exception is not enabled
this exception cannot be disabled because there is no corresponding enable bit.
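the three conversion ranges listed in 8.3.6 can be checked with plain integer comparisons. the python sketch below is a hedged illustration. the constant and function names are hypothetical, and the helper only reproduces the range test, not the rounding performed by the conversion instructions.

```python
# inclusive (low, high) limits taken from 8.3.6
LONG_RESULT_RANGE = (-(2 ** 53), 2 ** 53 - 1)   # floating point -> 64-bit fixed point
WORD_RESULT_RANGE = (-(2 ** 31), 2 ** 31 - 1)   # floating point -> 32-bit fixed point
LONG_SOURCE_RANGE = (-(2 ** 55), 2 ** 55 - 1)   # 64-bit fixed point -> floating point


def raises_unimplemented(value, value_range):
    """True if the integer value lies outside the inclusive range."""
    low, high = value_range
    return not (low <= value <= high)


def as_unsigned(value, bits):
    """Two's-complement bit pattern of value in the given width."""
    return value & ((1 << bits) - 1)
```

the bit patterns produced by as_unsigned match the hexadecimal limits quoted in the text, for example as_unsigned(-(2 ** 53), 64) gives 0xffe0 0000 0000 0000.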
8.4 saving and restoring status

the ldc1 or sdc1 instruction is executed for 16 doublewords (note) to save or restore the status of the floating-point registers to or from memory. information on fcr31, fcr28, fcr26, and fcr25 is saved to or restored from a cpu register by the cfc1 or ctc1 instruction. usually, fcr31 is saved first and restored last.

if the fpu is executing a floating-point instruction when fcr31, fcr28, fcr26, or fcr25 is read, the instruction may be completely executed or reported as an exception. because the architecture does not allow a pending instruction to cause an exception, if execution of the pending instruction cannot be completed, that instruction is transferred to an exception register (if any). information such as the type of the exception is stored in fcr31, fcr28, fcr26, or fcr25. when the status is restored, fcr31 indicates that an exception is pending.

by writing a value of 0 to the cause bits of fcr31 or fcr26, all pending exceptions can be cleared, and normal processing can be resumed after the status of the floating-point registers has been restored.

the cause bits of fcr31 and fcr26 hold the result of only one instruction. the fpu checks the operand before executing an instruction to judge whether an exception may occur. if an exception may occur, the fpu executes this instruction by using a stall, so that two or more instructions (that may cause an exception) are not executed at the same time.

note: thirty-two doublewords if the fr bit of the status register in cp0 is set to 1

8.5 handler for ieee754 exceptions

ieee754 recommends an exception handler that can store calculation results in the destination register regardless of which of the five standard exceptions occurs.

the exception handler can identify the following by using the epc register to search for an instruction.

• occurrence of exception during instruction execution
• instruction under execution
• format of destination

to obtain the correctly rounded result if an overflow, underflow (except the conversion instruction), or inexact operation exception occurs, the exception handler must have software that checks the source register and simulates instructions.

if an invalid operation exception or division-by-zero exception occurs or if an overflow exception or underflow exception occurs during floating-point conversion, the exception handler must have software that can obtain the value of the operand by checking the source register of the instruction.

ieee754 recommends that, if possible, the overflow and underflow exceptions have a priority higher than the inexact operation exception. this priority is set by software. the hardware sets both the overflow or underflow bit and the inexact operation bit.
chapter 9 initialization interface

9.1 functional outline

the vr5500 can be reset in three ways by using the coldreset# and reset# signals.

• power-on reset
  when the power supply has been stabilized after power application, all clocks are started. a power-on reset completely initializes the internal information of the processor without saving any status information.

• cold reset
  if the coldreset# signal is asserted while the processor is operating, all clocks are restarted and the test interface circuit is also initialized. a cold reset completely initializes the internal statuses of the processor without saving any status information.

• warm reset
  although the processor is restarted, the clock and test interface circuits are not affected. by using a warm reset, most of the internal statuses of the processor can be retained. however, the contents of registers are undefined.

after reset, the processor serves as the bus master and drives the sysad bus.

when coordinating a system reset with other system elements, note the following: generally, the operation is undefined if a bus error occurs immediately before, during, or immediately after reset. in addition, reset initializes only a part of the internal status. therefore, completely initialize the processor by software.

the statuses of the registers, control signals, and current are undefined from when power is applied to when reset is completed.
9.2 reset sequence

the following two signals are used during reset.

(1) coldreset#
assert this signal to execute a power-on reset or cold reset. synchronize it with sysclock to deassert it.

(2) reset#
assert this signal to execute all reset operations. this signal does not have to be synchronized with the coldreset# signal when it is asserted. when only the reset# signal is asserted, a warm reset is started. to deassert this signal, synchronize it with sysclock.

9.2.1 power-on reset

the sequence of a power-on reset is as follows.

1. confirm that stable vdd and vdd io are supplied within the specified voltage range. also confirm that the system clock of the specified frequency is stable and continues operating.
2. after the power supply has been stabilized, assert the coldreset# signal for the duration of at least 64 k sysclock cycles. deassert the coldreset# signal in synchronization with sysclock.
3. the processor starts operating when the reset# signal is asserted after the coldreset# signal has been deasserted. keep the reset# signal active for the duration of at least 16 sysclock cycles after the coldreset# signal has been deasserted. deassert the reset# signal in synchronization with sysclock.

the status of the initialization signals (refer to 9.3) is latched 1 sysclock cycle after the coldreset# signal has been deasserted. set the input level of the initialization signals before starting a power-on reset. keep the level from changing during operation.

at reset, the processor serves as the bus master and drives the sysad bus.

when the reset# signal is deasserted, the processor branches to the reset exception vector and starts execution of the reset exception handler.

figure 9-1 shows the timing of a power-on reset.
figure 9-1. power-on reset timing
(signals shown: vdd io, vdd, sysclock (input), coldreset# (input), reset# (input); annotations: 3.135 v, 1.425 v, 100 ms, 64 k sysclock, 16 sysclock, t_ds)

9.2.2 cold reset

the sequence of a cold reset is the same as that of a power-on reset except that the power supply must be stabilized before the reset signal is asserted.

figure 9-2 shows the timing of a cold reset.

figure 9-2. cold reset timing
(signals shown: vdd io, vdd, sysclock (input), coldreset# (input), reset# (input); annotations: 64 k sysclock, 16 sysclock, t_ds)
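the minimum assertion times in the power-on and cold reset sequences are stated in sysclock cycles, so the wall-clock duration depends on the sysclock frequency. the python sketch below converts them to microseconds. it assumes that "64 k" means 64 × 1024 cycles, and the 100 mhz frequency used in the example is hypothetical, not a value from this manual.

```python
COLDRESET_MIN_CYCLES = 64 * 1024  # assumption: "64 k" = 65,536 sysclock cycles
RESET_MIN_CYCLES = 16             # minimum reset# assertion after coldreset# is deasserted


def min_assert_time_us(cycles, sysclock_hz):
    """Duration of the given number of sysclock cycles in microseconds."""
    if sysclock_hz <= 0:
        raise ValueError("sysclock frequency must be positive")
    return cycles / sysclock_hz * 1e6
```

with a 100 mhz sysclock, for example, coldreset# would have to stay asserted for roughly 655 microseconds and reset# for a further 0.16 microseconds under these assumptions.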
9.2.3 warm reset

a warm reset is started if the reset# signal is asserted in synchronization with sysclock. keep the reset# signal active for the duration of at least 16 sysclock cycles before deasserting it in synchronization with sysclock. a warm reset causes the processor to generate a soft reset exception.

because a warm reset is started as soon as the reset# signal has been asserted, multiple-cycle operations such as processing of a cache miss and floating-point instructions are stopped, and the data and results may be lost.

at reset, the processor serves as the bus master and drives the sysad bus. when executing a warm reset while a sysad bus transaction is in progress, also reset the external agent so that a conflict does not occur on the sysad bus.

when the reset# signal is deasserted, the processor branches to the reset exception vector and starts executing the soft reset exception handler.

figure 9-3 shows the timing of a warm reset.

figure 9-3. warm reset timing
(signals shown: vdd io, vdd, sysclock (input), coldreset# (input), reset# (input); annotations: 16 sysclock, t_ds)

9.2.4 processor status at reset

after a power-on reset, cold reset, and warm reset, all the internal statuses of the processor are reset and the processor starts program execution from the reset vector.

the internal settings of the processor are retained after a warm reset has been executed. however, the status of the cache may be retained or not depending on whether processing of a cache miss has been aborted by resetting the processor. in addition, because the vr5500 has a non-blocking structure, updating registers is canceled if execution of a load instruction is not complete when a reset is executed.

the branch history table is initialized by a power-on reset and cold reset.

the statuses of the registers, control signals, and current are undefined from when power is applied to when reset is completed.
9.3 initialization signals

the vr5500 has eight types of input signals that are sampled during initialization. these signals are used to set the division ratio of the clock, the byte configuration of memory, and the protocol of the system interface.

set the level of these signals before starting a power-on reset. keep the level unchanged during operation.

(1) divmode(2:0)
these signals specify the division ratio of the internal processor clock (pclock) and external system clock (sysclock). eight types of division ratios can be set: 2, 2.5, 3, 3.5, 4, 4.5, 5, and 5.5.

(2) bigendian
this signal specifies the byte order used by the processor during operation. when it is high, big endian is specified; when it is low, little endian is specified.

(3) busmode
this signal specifies the bus width of the system interface. when this signal is high, the bus width is 64 bits; when it is low, the bus width is 32 bits.

(4) tintsel
this signal specifies the interrupt source allocated to the ip7 bit of the cause register. when it is high, the timer interrupt is selected, and an interrupt request executed by asserting the int5# pin or an external write request (sysad5) is ignored. when this signal is low, the interrupt request executed by the int5# pin or an external write request (sysad5) is selected, and the timer interrupt request is ignored.

(5) disdvalido#
this signal specifies the operation of the validout# signal. when this signal is low, the validout# signal is asserted only during the address issuance cycle; when it is high, the validout# signal is asserted even if address issuance is stalled due to ready control.

(6) dwbtrans#
this signal specifies expansion of the data transfer size when the system interface is 32 bits wide. if this signal is low, doubleword block transfer is enabled; it is disabled when this signal is high.
(7) o3return#
this signal specifies the protocol of the system interface. when it is low, the out-of-order return mode is specified; when it is high, the normal mode is specified.

(8) drvcon
this signal specifies the impedance control level of the output driver. when it is high, the level is weak; when it is low, the level is normal. it is recommended to set this signal to the low level (normal) with the vr5500.
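the meaning of the single-bit initialization signals can be summarized in a small decoding routine. the python sketch below is a hypothetical illustration: the class, field, and parameter names are invented for this example, True stands for a high input level, and divmode(2:0) and disdvalido# are left out because their encodings are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitConfig:
    byte_order: str                  # "big" or "little"
    bus_width_bits: int              # 64 or 32
    ip7_source: str                  # "timer" or "int5#/sysad5"
    doubleword_block_transfer: bool  # relevant when the bus is 32 bits wide
    out_of_order_return: bool
    driver_strength: str             # "weak" or "normal"


def decode_init_signals(bigendian, busmode, tintsel, dwbtrans_n, o3return_n, drvcon):
    """Translate sampled pin levels (True = high) into the settings described in 9.3."""
    return InitConfig(
        byte_order="big" if bigendian else "little",
        bus_width_bits=64 if busmode else 32,
        ip7_source="timer" if tintsel else "int5#/sysad5",
        doubleword_block_transfer=not dwbtrans_n,  # active low: low enables
        out_of_order_return=not o3return_n,        # active low: low selects out-of-order return
        driver_strength="weak" if drvcon else "normal",
    )
```

for example, decode_init_signals(True, False, True, False, True, False) describes a big-endian processor on a 32-bit bus with the timer on ip7, doubleword block transfer enabled, normal return mode, and the recommended normal driver level.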
chapter 10 clock interface

this chapter explains the clock interface used in the vr5500.

10.1 term definitions

this manual uses the following terms when describing signals.

• "rising edge" indicates the point of transition from low level to high level.
• "falling edge" indicates the point of transition from high level to low level.
• "clock-q delay" indicates the time from when the clock that latches data into a device is input (clock) to when the device outputs the data (q).

figures 10-1 and 10-2 illustrate the meanings of these terms.

figure 10-1. signal's transition points
(shows clock cycles 1 to 4, the length of 1 clock cycle, the point of transition from high level to low level, and the point of transition from low level to high level)

figure 10-2. clock-q delay
(shows data input, clock input, data output (q), and the clock-q delay between the clock input and the data output)
10.2 basic system clock

the vr5500 uses the following clock signals.

(1) sysclock
the internal clock of the vr5500 is generated based on sysclock. the interface with the external device also operates based on sysclock.

(2) pclock
the frequency ratio of pclock to sysclock can be selected from 2:1, 2.5:1, 3:1, 3.5:1, 4:1, 4.5:1, 5:1, and 5.5:1. this ratio is set by the signals input from the divmode(2:0) pins at reset. all the internal registers and latches use pclock.

figure 10-3. when frequency ratio of sysclock to pclock is 1:2
(shows cycles 1 to 4 of sysclock (input), pclock (internal), the output data on the signals in the note with t_dm and t_do, and the input data on the signals in the note with t_ds and t_dh)

note: sysad(63:0), sysadc(7:0), syscmd(8:0), sysid(2:0)
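the relationship between sysclock and pclock can be expressed as a multiplication by one of the eight selectable ratios. the python sketch below uses exact fractions so that the half-step ratios do not introduce rounding. the function name and the example frequencies are assumptions, and the mapping from divmode(2:0) pin levels to a ratio is deliberately not modeled.

```python
from fractions import Fraction

# pclock:sysclock ratios 2, 2.5, 3, 3.5, 4, 4.5, 5, and 5.5
ALLOWED_RATIOS = [Fraction(n, 2) for n in range(4, 12)]


def pclock_hz(sysclock_hz, ratio):
    """Internal clock frequency for a given sysclock frequency and ratio."""
    ratio = Fraction(ratio)
    if ratio not in ALLOWED_RATIOS:
        raise ValueError("ratio must be one of 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5")
    return sysclock_hz * ratio
```

for example, a hypothetical 100 mhz sysclock with a ratio of 4 gives a 400 mhz pclock.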
10.2.1 synchronization with sysclock

the processor output data changes when t_dm has elapsed after the rising edge of sysclock, and is in the stable output status when t_do has elapsed. this time is the sum of the maximum value of the clock-q delay of the processor output register and the maximum value of the delay when the data passes through the processor output driver.

keep the data supplied to the processor stable for the duration of at least t_ds before sysclock rises, and for the duration of t_dh after the rising edge of sysclock, as shown in figure 10-3.

10.3 phase lock loop (pll)

the processor has an internal pll circuit that is used to synchronize sysclock with pclock. because of the nature of the pll circuit, however, a clock synchronized with sysclock can be generated only within a limited frequency range.

the clock generated by using the pll circuit has a specific uncertainty called jitter. the clock synchronized with sysclock by the pll circuit leads or lags behind sysclock by up to the maximum permissible jitter value t_j. to obtain accurate i/o timing parameters, therefore, add t_j to t_ds, t_dh, and t_do, and subtract t_j from t_dm.
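the jitter adjustment described above is a simple addition or subtraction per parameter. the python sketch below applies it to a set of timing values. the function name and the use of integer picoseconds are assumptions, and the numbers in the example are hypothetical rather than values from the electrical specifications.

```python
def apply_jitter(t_ds, t_dh, t_do, t_dm, t_j):
    """Return the timing parameters adjusted for the maximum jitter t_j.

    t_j is added to t_ds, t_dh, and t_do and subtracted from t_dm.
    all values must use the same unit (integer picoseconds here).
    """
    return {
        "t_ds": t_ds + t_j,
        "t_dh": t_dh + t_j,
        "t_do": t_do + t_j,
        "t_dm": t_dm - t_j,
    }
```

for example, with hypothetical values of 2000, 1000, 5000, and 1500 ps for t_ds, t_dh, t_do, and t_dm and a jitter of 250 ps, the adjusted values are 2250, 1250, 5250, and 1250 ps.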
chapter 11 cache memory

this chapter explains the cache memory: its place in the vr5500 core memory organization, and the individual organization of the caches.

11.1 memory organization

figure 11-1 shows the vr5500 core system memory hierarchy. in the logical memory hierarchy, the caches are located between the cpu and main memory. they are designed to make the speedup of memory accesses transparent to the user.

each functional block in figure 11-1 has the capacity to hold more data than the block above it. for example, main memory (physical memory) has a larger capacity than the caches. at the same time, each functional block takes longer to access than any block above it. for example, it takes longer to access data in the main memory than in the cpu on-chip registers.

figure 11-1. logical hierarchy of memory
(from top to bottom: registers of the vr5500 cpu; caches (instruction cache and data cache); main memory; peripheral devices (disc, cd-rom, tape, etc.). access time is faster toward the top, and data capacity increases toward the bottom.)
11.1.1 internal cache

the vr5500 has two caches. one of them is an instruction cache that holds instructions (program). the other is a data cache that holds data. when writing data to the data cache, translation of the store address and tag check are performed in the first phase, and then the data is written to ram in the next phase.

figure 11-2 shows the relationship between the cache and memory.

figure 11-2. internal cache and main memory
(shows the vr5500, containing the cache controller, instruction cache, and data cache, connected to the main memory)

the features of the internal cache are as follows.

• index using virtual address
• physical address held by tag
• coherency with memory maintained by writeback or write through
• data management by two-way set associative method
• line lock can be specified
• cache line replacement by lru (least recently used) algorithm
• non-blocking structure (data cache only)

the size of both the instruction and data caches of the vr5500 is 32 kb.
11.2 configuration of cache

this section explains the configuration of the internal data and instruction caches of the vr5500.

a cache consists of blocks called cache lines. a cache line is the minimum unit of information that can be fetched from the main memory to the cache, and is divided into a tag and data. the size of a cache line of both the instruction cache and data cache is 8 words (32 bytes).

11.2.1 configuration of instruction cache

figure 11-3 shows the format of an 8-word (32-byte) instruction cache line.

figure 11-3. format of instruction cache line
tag (bits 28 to 0): p (bit 28), itag (bits 27 to 4), followed in the low-order bits by l, state, and r
data: four 66-bit entries (bits 263 to 198, 197 to 132, 131 to 66, and 65 to 0), each consisting of datap (upper 2 bits) and data (lower 64 bits)

itag: instruction tag
l: lock bit (line lock status)
state: status bit (line status)
r: lru bit (way indication of candidate for replacement)
p: parity bit (even parity for itag)
datap: even parity for data (in word units)
data: data of instruction cache
11.2.2 Configuration of data cache

Figure 11-4 shows the format of an 8-word (32-byte) data cache line.

Figure 11-4. Line Format of Data Cache
[Figure: the tag portion (bits 28 to 0) holds the P, DTag, L, State, and R fields. The data portion (bits 287 to 0) holds four 64-bit Data fields, each accompanied by 8 DataP parity bits.]

DTag:   Data tag
L:      Lock bit (line lock status)
State:  Status bit (line status)
R:      LRU bit (way indication of candidate for replacement)
P:      Parity bit (even parity for DTag)
DataP:  Even parity for data (in byte units)
Data:   Data of data cache

11.2.3 Location of data cache

The VR5500 manages cache data by a two-way set associative method. This method divides the cache into two blocks of memory space (ways), and allocates two cache lines to the same index (refer to 11.3.5 Accessing cache).
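The DataP parity described in 11.2.1 and 11.2.2 can be illustrated with a short sketch. It assumes the usual definition of even parity (the parity bit is chosen so that the number of 1 bits, including the parity bit itself, is even). The function names and the ordering of the returned parity bits are hypothetical and are not taken from this manual.

```python
def even_parity_bit(value):
    """Return the bit that makes the total count of 1 bits (value + parity) even."""
    return bin(value).count("1") & 1


def byte_parity(doubleword):
    """Per-byte even parity of a 64-bit value (data cache style, 8 parity bits).

    Assumption: bit i of the result covers byte i, counted from the least
    significant byte.
    """
    parity = 0
    for i in range(8):
        byte = (doubleword >> (8 * i)) & 0xFF
        parity |= even_parity_bit(byte) << i
    return parity


def word_parity(doubleword):
    """Per-word even parity of a 64-bit value (instruction cache style, 2 parity bits).

    Assumption: bit 0 covers the low 32-bit word, bit 1 the high word.
    """
    low = doubleword & 0xFFFFFFFF
    high = (doubleword >> 32) & 0xFFFFFFFF
    return even_parity_bit(low) | (even_parity_bit(high) << 1)
```

For example, `byte_parity(0x0100)` sets only the parity bit for byte 1, because that is the only byte with an odd number of 1 bits.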
11.3 Cache operations

As described earlier, caches provide temporary data storage, and they speed up memory accesses as seen by the user. In general, the processor accesses cache-resident instructions or data using the following procedure.

(1) The processor attempts to access the next instruction or data in the appropriate cache via the on-chip cache controller.
(2) The cache controller checks whether this instruction or data is present in the cache. If the instruction/data is present, the CPU retrieves it. This is called a cache hit. If the instruction/data is not present in the cache, the cache controller retrieves it from the main memory. This is called a cache miss.
(3) When the required data or instruction is found, the cache controller passes it to the processor. The processor then continues operating.

If a cache miss occurs, data is read from the main memory and one of the cache lines is overwritten. This is called replacing a cache line. The VR5500 manages the cache by a two-way set associative method, with two cache lines allocated to one index. If a cache miss occurs, which of the two lines is to be replaced is determined by the LRU (least recently used) method. The way that is a candidate for replacement is indicated by the LRU bit of the cache tag.

The cache of the VR5500 has a line lock function. If a cache line is locked when it is allocated, that line is not replaced even if a cache miss occurs. If a cache miss occurs while the lines of both ways are locked, however, one of the cache lines is unlocked in accordance with the LRU bit. A cache line is locked or unlocked by the CACHE instruction. The lock setting is indicated by the lock bit of the cache tag.

11.3.1 Coherency of cache data

It is possible for the same data to be in two places simultaneously: the main memory and a cache. The coherency of this data is maintained by using the writeback or write-through method. With the VR5500, the data cache management technique can be selected from writeback and write through, depending on the setting of the EntryLo register or Config register of CP0.

The writeback method stores write data only in the cache, without writing it directly to the main memory. Some time later, the data written to the cache is independently transferred to the main memory. In the VR5500, a modified cache line is not written back to the memory until the cache line is to be replaced, either in the course of satisfying a cache miss or during the execution of a writeback CACHE instruction.

With the write-through method, data written to the memory is also written to the cache simultaneously.
11.3.2 Replacing instruction cache line

If a miss occurs in the instruction cache, the cache line is replaced by using sub-block ordering. The processor issues a memory read request, reads the requested cache line from the main memory, and writes it to the instruction cache. Execution of the pipeline is then resumed and the instruction cache is accessed again.

11.3.3 Replacing data cache line

If a miss occurs while data is being loaded from or stored in a cache, the cache line is replaced in compliance with the following rules.

(1) Data load miss
If the cache line on which a miss has occurred is not dirty, that cache line is replaced with a new cache line. If the cache line is dirty, the cache line is first transferred to the write transaction buffer. Then the cache line on which the miss occurred is replaced with a new cache line, and the data transferred to the write transaction buffer is written to memory.

(2) Data store miss
(a) With writeback cache
If the cache line on which a miss has occurred is not dirty, that cache line is replaced by store data merged with a new cache line. If the cache line is dirty, that cache line is first transferred to the write transaction buffer. Then store data merged with a new cache line is written to the cache, and the data transferred to the write transaction buffer is written to memory.
(b) With write-through cache
If the cache line on which a miss has occurred is not dirty, that cache line and the memory contents are replaced by store data merged with a new cache line. If the cache line is dirty, that cache line is first transferred to the write transaction buffer. Then store data merged with a new cache line is written to the cache and memory.
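The replacement rules in 11.3.3 can be summarized as a small decision function. This is only an illustrative sketch: the function name, argument values, and action strings are hypothetical, and the model follows the rules literally as written above without representing timing or the refill buffer.

```python
def data_miss_actions(access, policy, victim_dirty):
    """List, in order, the actions taken on a data cache miss.

    access:       "load" or "store"
    policy:       "writeback" or "write-through" (ignored for loads)
    victim_dirty: True if the line to be replaced is dirty
    """
    actions = []
    # A dirty victim is always saved to the write transaction buffer first.
    if victim_dirty:
        actions.append("move victim line to write transaction buffer")

    if access == "load":
        actions.append("replace line with new cache line")
    elif policy == "writeback":
        actions.append("write new line merged with store data to cache")
    else:
        actions.append("write new line merged with store data to cache and memory")

    # The text states the buffered line is written to memory for a load miss
    # and for a writeback store miss.
    if victim_dirty and (access == "load" or policy == "writeback"):
        actions.append("write buffered victim line to memory")
    return actions
```

Calling `data_miss_actions("store", "writeback", True)` returns the three steps of rule (2)(a) for a dirty line, in the order given in the text.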
11.3.4 Speculative replacement of data cache line

The VR5500 adds an unguarded attribute to the algorithm of the data cache. This attribute can be selected according to the setting of the EntryLo register or Config register of CP0 when the data cache is used (refer to Chapter 5 Memory Management System).

The VR5500 speculatively executes instructions by using branch prediction and an out-of-order mechanism. If a data load miss or data store miss occurs as a result of speculative execution of an instruction, the refill buffer temporarily holds the data that will replace the cache line. If the conventional algorithm is selected for the data cache, replacement is not started until this instruction is committed, even if the refill buffer becomes full. By contrast, if the unguarded attribute is selected, replacement can be started even before this instruction is committed. Speculative replacement of this kind cannot be stopped once it has been started, regardless of whether its result is necessary.

Caution  Make sure that the following conditions are satisfied in the area where the unguarded attribute is specified.
• The OS uses the virtual address space and all spaces are contiguous.
• If I/O is connected, a device whose status is not changed by a read must be used.
If the address space is not contiguous, a bus error exception occurs when a load instruction is speculatively executed, so the result cannot be discarded and the system hangs up. If an I/O device whose status may be changed by a read is connected, the status on the I/O side is changed when a load instruction is speculatively executed, so the result cannot be discarded.

Remarks 1. Speculative processing using the unguarded attribute is only executed for the data cache.
        2. For accesses to the area with the unguarded attribute, a read request is speculatively output from the system interface before the instruction is committed, but a write request is output after the instruction has been committed. By contrast, for an access to the uncached area, a read request is also output to the system interface only after the instruction has been committed.
11.3.5 Accessing cache

The CACHE instruction is used to change the status of a cache line or to write back cache data (for details, refer to Chapter 17 CPU Instruction Set).

Part of the virtual address (VA) is used to index the instruction cache and data cache. Because each cache of the VR5500 is 32 KB and two-way set associative, each way is 16 KB, and the most significant bit of the index is VA13. In addition, because the line size is 8 words (32 bytes), the least significant bit of the index is VA5. The way to be accessed is specified by the LRU method for Hit, Fill, and Fetch_and_Lock operations, and by VA0 for other operations.

Figure 11-5 shows the relationship between the index and data output of the cache.

Figure 11-5. Index and Data Output of Cache
[Figure: VA(13:5) on the internal address bus selects a tag line and a data line in both way 0 and way 1. Each tag line holds P, Tag, L, R, and State. Each data line supplies 64 bits of Data and 8 bits of DataP to the internal data bus. VA0 is shown as the way select.]
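The index range VA(13:5) follows directly from the cache geometry given in 11.3.5. The sketch below reproduces that arithmetic; the constant and function names are hypothetical, and only the cache size, number of ways, and line size come from the text.

```python
CACHE_SIZE = 32 * 1024   # bytes per cache (instruction or data)
WAYS = 2                 # two-way set associative
LINE_SIZE = 32           # bytes per cache line (8 words)

WAY_SIZE = CACHE_SIZE // WAYS                 # 16 KB per way
LINES_PER_WAY = WAY_SIZE // LINE_SIZE         # 512 lines per way
OFFSET_BITS = LINE_SIZE.bit_length() - 1      # 5 bits: VA(4:0) select a byte in the line
INDEX_BITS = LINES_PER_WAY.bit_length() - 1   # 9 bits: VA(13:5) select the line


def cache_index(va):
    """Extract the line index, VA(13:5), from a virtual address."""
    return (va >> OFFSET_BITS) & (LINES_PER_WAY - 1)


def line_offset(va):
    """Extract the byte offset within the line, VA(4:0)."""
    return va & (LINE_SIZE - 1)
```

With these values the index occupies bits 5 through 13 of the virtual address, so two addresses that differ by a multiple of 16 KB map to the same index.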
11.4 Status of cache

A cache line may be in one of the following three states, which indicate the validity of data and coherency with the main memory. The status of a cache line is undefined after reset. Initialize it by software.

(1) Instruction cache
The instruction cache may be in either of the following two states.
• Invalid: State in which the cache line does not have valid information. A cache line in this state cannot be used. Set all the cache lines to invalid by software after a warm reset. A cache line not in the invalid status is assumed to have valid information. Neither a cold reset nor a warm reset makes the cache status invalid. The cache is invalidated by software.
• Clean: State in which the cache line has valid information that has been fetched from the main memory. It can be specified by software whether the cache line is locked or not.

(2) Data cache
The data cache may be in any of the following three states.
• Invalid: State in which the cache line does not have valid information. A cache line in this state cannot be used. Set all the cache lines to invalid by software after a warm reset. A cache line not in the invalid status is assumed to have valid information. Neither a cold reset nor a warm reset makes the cache status invalid. The cache is invalidated by software.
• Clean: State in which the cache line has valid information that has not been changed after being fetched from the main memory. It can be specified by software whether the cache line is locked or not.
• Dirty: State in which the cache line has valid information that has been changed after being loaded from the main memory. It can be specified by software whether the cache line is locked or not.

A cache line in the clean or dirty status may be changed when the processor executes a certain type of CACHE instruction operation. For the operations of the CACHE instruction, refer to Chapter 17 CPU Instruction Set.
11.5 Manipulating cache by external agent

The VR5500 does not allow an external agent to check or manipulate the statuses and contents of either of the caches.
Chapter 12 Overview of System Interface

The processor uses the system interface to access the external resources necessary for processing a cache miss and for accessing the uncached area, and the external agent uses the system interface to access the internal resources of the processor.

The system interface of the VR5500 has several modes, including a mode in which another read request can be issued even if the first read operation is not complete and read responses can be separated and returned, and a mode that is compatible with the VR5000. These modes are selected by a combination of the levels input to the initialization pins at reset.

This chapter explains the bus modes and basic operations of the system interface of the VR5500.

12.1 Definition of terms

The following terms are used in Chapters 13, 14, and 15.
• External agent: A device connected to the processor via the system interface which processes requests issued by the processor.
• System event: An event that is generated in the processor and requests access to the external resources. For example, the following events are included.
  • Occurrence of a miss in the instruction cache when an instruction is fetched
  • Occurrence of a miss in the data cache when a load/store instruction is executed
  • Execution of a load/store instruction to the uncached area
• Sequence: Requests successively generated by the processor to process a system event.
• Protocol: Signal transition in each cycle of the system interface pins by which the processor or external agent issues requests.
• Syntax: Definition of the bit pattern of a code bus such as a command bus.
12.2 Bus modes

The VR5500 has the following five types of bus modes. For details of the operation, refer to the corresponding chapter.
• 64-bit R5000 mode: Refer to Chapter 13 System Interface (64-Bit Bus Mode).
• 64-bit out-of-order return mode: Refer to Chapter 15 System Interface (Out-of-Order Return Mode).
• 32-bit R5000 mode (compatible with PMC-Sierra's RM523x): Refer to Chapter 14 System Interface (32-Bit Bus Mode).
• 32-bit VR5432 native mode: Refer to Chapter 14 System Interface (32-Bit Bus Mode).
• 32-bit out-of-order return mode: Refer to Chapter 15 System Interface (Out-of-Order Return Mode).

The bus modes other than the out-of-order return mode are collectively called the normal mode.

These modes are selected by using the BusMode, O3Return#, DWBTrans#, and DisDValidO# signals at reset. The figure below shows the relationship between the setting of each signal and the mode to be selected.

Figure 12-1. Bus Modes of VR5500
[Figure: tree diagram that splits the VR5500 bus mode into the 64-bit bus mode (out-of-order return mode, R5000 mode) and the 32-bit bus mode (out-of-order return mode, R5000 mode (compatible with RM523x), VR5432 native mode). Branch labels as extracted: BusMode = H; BusMode = L; O3Return# = L; O3Return# = H, DWBTrans# = H, DisDValidO# = H; O3Return# = H, DWBTrans# = L, DisDValidO# = L.]

Remarks 1. H: High level, L: Low level
        2. When the O3Return# signal is low, the DWBTrans# and DisDValidO# signals can be set to any level, but keep the level from changing during operation.
12.3 Outline of system interface

12.3.1 Interface bus

The SysAD bus (address/data bus) and SysCmd bus (command bus) are the main communication buses of the system interface. Because both buses are bidirectional, they can be driven by a processor that issues processor requests or by an external device that issues external requests (for details, refer to 12.4.4 Processor requests and external requests).

A request that passes through the system interface consists of the following.
• Address
• Response data to a read request or write data of a write request
• Command specifying the type of request/data

Figure 12-2 shows the interface bus in the 64-bit bus mode, and Figure 12-3 shows the interface bus in the 32-bit bus mode.

Figure 12-2. System Interface Bus (64-Bit Bus Mode)
[Figure: SysCmd(8:0) and SysAD(63:0) connect the VR5500 and the external agent.]

Figure 12-3. System Interface Bus (32-Bit Bus Mode)
[Figure: SysCmd(8:0) and SysAD(31:0) connect the VR5500 and the external agent.]
12.3.2 Address cycle and data cycle

A cycle in which a valid address is on the SysAD bus is called an address cycle. A cycle in which valid data is on the SysAD bus is called a data cycle. The VR5500 uses the ValidOut# signal to indicate that the address/data it outputs to the system bus is valid. The external agent uses the ValidIn# signal to indicate that the address/data it outputs to the system bus is valid.

The SysCmd bus identifies the contents of the SysAD bus in a valid cycle. The most significant bit of the SysCmd bus always indicates whether the current cycle is an address cycle or a data cycle. The SysCmd bus indicates the following contents when the ValidOut# or ValidIn# signal is active.
• In an address cycle (SysCmd8 = 0), SysCmd(7:0) is a system interface command.
• In a data cycle (SysCmd8 = 1), SysCmd(7:0) is a data identifier.

For details of the command and data identifier codes, refer to the descriptions of system interface commands and data identifiers in Chapters 13, 14, and 15.

12.3.3 Issuance cycle

(1) Processor request
The processor issues two types of requests: a processor read request and a processor write request. The issuance cycle of the processor read request is determined by the status of the RdRdy# signal, and that of the processor write request is determined by the status of the WrRdy# signal.

The issuance cycle is a cycle that is valid in the address cycle of each processor request. Only one issuance cycle exists per processor request.

To define the issuance cycle of an address cycle, the external agent must assert the RdRdy#/WrRdy# signal no later than two cycles before the address cycle of a processor read/write request, as shown in Figure 12-4. To set an address cycle as the issuance cycle, do not deassert the RdRdy#/WrRdy# signal until that address cycle is started.

Figure 12-4. Status of RdRdy#/WrRdy# Signal of Processor Request
[Figure: timing chart over SysCycle 1 to 6 showing SysClock (internal), SysAD(63:0) (I/O) carrying Addr in the issuance cycle, and RdRdy#/WrRdy# (input).]
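The SysCmd(8) rule described in 12.3.2 can be sketched as a small decoder. The function name and return format are hypothetical. The sketch only separates the cycle type from the 8-bit payload and does not interpret individual command or data identifier codes, which are defined in Chapters 13, 14, and 15.

```python
def decode_syscmd(syscmd):
    """Split a 9-bit SysCmd(8:0) value into (cycle type, SysCmd(7:0)).

    Only meaningful in a cycle where ValidOut# or ValidIn# is active.
    """
    if not 0 <= syscmd < (1 << 9):
        raise ValueError("SysCmd is a 9-bit bus")
    if (syscmd >> 8) & 1:
        cycle = "data cycle: data identifier"
    else:
        cycle = "address cycle: system interface command"
    return cycle, syscmd & 0xFF
```

For example, `decode_syscmd(0x1A5)` reports a data cycle with the data identifier 0xA5, while `decode_syscmd(0x0A5)` reports an address cycle with the command 0xA5.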
(2) Processor request and external request
The processor releases the system interface to the slave status and receives an external request in response to the ExtRqst# signal from the external agent, even when it is about to issue a processor request. If issuance of a processor request conflicts with issuance of an external request, the processor takes either of the following actions.
• Completes issuance of the processor request before receiving the external request.
• Releases the system interface to the slave status without completing issuance of the processor request.
In the latter case, the processor issues the processor request after the external request has been completed (if the processor request is still necessary).

12.3.4 Handshake signal

The processor manages the flow of requests by using the following seven control signals.

(1) RdRdy# and WrRdy# signals
The external agent uses these signals to indicate whether it is ready to receive a new read transaction or a new write transaction.

(2) ExtRqst#, Release#, and PReq# signals
These signals are used to control transfer on the SysAD bus and SysCmd bus. The ExtRqst# signal is used by the external agent to indicate that it needs the right to control the interface. The Release# signal is asserted by the processor when the processor grants the external agent the right to control the system interface. The PReq# signal is used by the processor to indicate that it needs the right to control the interface.

(3) ValidOut# and ValidIn# signals
The processor uses the ValidOut# signal and the external agent uses the ValidIn# signal to indicate a valid command/data on the SysCmd or SysAD bus.
12.3.5 System interface bus data

The data shown in Table 12-1 is driven on the SysAD and SysCmd buses. The symbols in this table are used in the timing charts shown in the latter part of this chapter.

Table 12-1. System Interface Bus Data

Range         Symbol   Meaning
Common        Unsd     Unused
SysAD(63:0)   Addr     Physical address
              Data     (Element n + 1 of) data
SysCmd(8:0)   Cmd      Unspecific system interface command
              Read     Read request command of processor or external agent
              Write    Write request command of processor or external agent
              SINull   External null request command for releasing system interface
              NEOD     Data identifier of last data element
              NData    Data identifier of data element other than last
12.4 System interface protocol

Figure 12-5 shows an operation between registers that is performed via the system interface. The output signal of the processor is directly output from an output register and changes at the rising edge of SysClock. The signal input to the processor is directly latched to an input register at the rising edge of SysClock.

Figure 12-5. Operation of System Interface Between Registers
[Figure: inside the VR5500, input data enters an input latch and output data leaves an output latch, both clocked by SysClock.]

12.4.1 Master status and slave status

The system interface is in the master status while the VR5500 is driving the SysAD bus or SysCmd bus. While the external agent is driving these buses, the system interface is in the slave status.

In the master status, the processor always asserts the ValidOut# signal if the SysAD bus and SysCmd bus are valid. In the slave status, the external agent must always assert the ValidIn# signal if the SysAD bus and SysCmd bus are valid.

The default bus master of the system interface is the processor. The external agent serves as the master of the system interface after the result of external arbitration has been obtained or after the processor has issued a read request. The external agent returns the right to control the bus to the processor when the external request has been completed.

The system interface remains in the master status unless either of the following occurs.
• The external agent requests and is granted the right to control the system interface (external arbitration).
• The processor issues a read request (uncompelled transition to slave status).

These two cases are explained below.
12.4.2 External arbitration

The system interface must be in the slave status when the external agent issues an external request via the system interface. To change the status of the system interface from master to slave, the processor performs arbitration by using the handshake signals of the system interface, ExtRqst# and Release#, in the following procedure.

<1> The external agent asserts the ExtRqst# signal to notify the processor that it wants to issue an external request.
<2> When the processor is ready to receive the external request, it asserts the Release# signal to change the status of the system interface from master to slave, and releases the system interface.
<3> The system interface returns to the master status as soon as the external request has been issued.

12.4.3 Uncompelled transition to slave status

Uncompelled transition of the system interface to the slave status is performed by the processor, and the system interface changes its status from master to slave when a processor read request is held pending. The Release# signal is automatically asserted when a read request is issued. Uncompelled transition to the slave status takes place in the cycle following that of the processor read request.

If an external request is issued after uncompelled transition to the slave status, the system interface returns to the master status. If there is a pending processor read request or if the external agent issues another external request, the processor asserts the Release# signal for one cycle and puts the system interface in the uncompelled slave status.

The external agent should confirm that the processor has put the system interface in the uncompelled slave status, and then start driving the SysCmd and SysAD buses. While the system interface is in the slave status, the external agent can start an external request without arbitrating for the system interface, i.e., without asserting the ExtRqst# signal.

If the ExtRqst# signal is active when the external request is completed, the system interface automatically returns to the master status.
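The master/slave transitions described in 12.4.1 to 12.4.3 can be pictured as a two-state machine. The following is a deliberately simplified sketch: the event names are hypothetical, only the transitions stated in the text are modeled, and details such as the one-cycle Release# pulse, the level of ExtRqst# at completion, and the handling of read responses are not represented.

```python
MASTER = "master"   # the VR5500 drives SysAD/SysCmd
SLAVE = "slave"     # the external agent drives SysAD/SysCmd

# (current status, event) -> next status
TRANSITIONS = {
    # External arbitration: ExtRqst# asserted, processor answers with Release#.
    (MASTER, "release_after_extrqst"): SLAVE,
    # Uncompelled transition: Release# is asserted automatically with a read request.
    (MASTER, "processor_read_request"): SLAVE,
    # The right of control returns to the processor after an external request.
    (SLAVE, "external_request_completed"): MASTER,
}


def next_status(status, event):
    """Return the interface status after an event; unknown events keep the status."""
    return TRANSITIONS.get((status, event), status)
```

Starting from the master status, a processor read request moves the interface to the slave status, and a completed external request moves it back.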
12.4.4 Processor requests and external requests

There are two types of requests: processor requests and external requests. When a system event occurs, the processor issues a request via the system interface and accesses the external resources needed to process the event. Accordingly, the system interface should be connected to the external agent that is used to control access to system resources. To request access to the processor's internal resources, the external agent issues an external request.

Processor requests include the following.
• Read request: Supplies the read address to the external agent
• Write request: Supplies the write address and either single data or block data to the external agent

External requests include the following.
• Write request: Supplies an address and word data to be written to the processor resources
• Null request: Returns the system interface to the master status without affecting the processor

These system events and requests are illustrated in Figure 12-6 below.

Figure 12-6. Requests and System Events
[Figure: the VR5500 sends processor requests (read, write) to the external agent, and the external agent sends external requests (write, null) to the VR5500. The system events listed are: load miss, store miss, store hit, load/store to uncached area, accelerated store to uncached area, instruction fetch from uncached area, and fetch miss.]
12.5 Processor requests

A processor request is a request for access to external resources via the system interface. Processor requests include read requests and write requests.

(1) Summary of requests
A read request is a request for data of a block, a doubleword, an unaligned doubleword, a word, or an unaligned word to be retrieved from the main memory or other system resources. A write request is a request which provides data of a block, a doubleword, an unaligned doubleword, a word, or an unaligned word to be written to the main memory or other system resources.

(2) Issuing requests
The processor issues requests using a completely sequential method. This means that the processor handles only one pending request at a time. For example, after the processor issues a read request, it waits for a read response before issuing the next request (except in the out-of-order return mode). The processor issues write requests only when there are no pending read requests.

(3) Control of requests
The RdRdy# and WrRdy# signals, which are input signals for the processor, are used by the external agent to control the flow of processor requests. The RdRdy# signal controls the flow of processor read requests, and the WrRdy# signal controls the flow of processor write requests. Figure 12-7 shows the sequence of processor request cycles.

Figure 12-7. Flow of Processor Requests
[Figure: between the VR5500 and the external agent. <1> By setting the RdRdy# and WrRdy# signals as active, the external system controls acknowledgement. <2> The processor issues a read or write request.]
12.5.1 Processor read request

Once the processor has issued a read request, the external agent should access the specified resource and return the requested data.

A processor read request can be separated from the response data of the external agent. In other words, the external agent can start an unrelated external request before returning response data in response to a processor read request. A processor read request ends when the last word of the response data has been received from the external agent.

The data identifier of the response data may indicate whether or not any errors exist in the response data. This enables the processor to generate a bus error exception.

In the VR5500, the external agent must be able to receive a new processor read request at any time if the following condition is satisfied.
• The RdRdy# signal is active at least two cycles before issuance of the address cycle.

In the normal mode, the external agent must be able to receive a new processor read request at any time if the following condition is satisfied.
• There is currently no pending processor read request.

In the out-of-order return mode, up to five read requests can be held pending.

12.5.2 Processor write request

Once the processor has issued a write request, the specified resource is accessed and the specified data is written. A processor write request ends when the last word of the data has been sent to the external agent.

The write requests of the VR5500 support the VR4000-compatible, write re-issuance, and pipeline write timing modes.

The external agent must be able to receive a new processor write request at any time if the following two conditions are satisfied.
• There is currently no pending processor read request.
• The WrRdy# signal is active at least two cycles before issuance of the address cycle and conforms to the requirements of the timing mode set by the Config register.

In the out-of-order return mode, a write request may be issued after a read request.
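The conditions in 12.5.1 and 12.5.2 can be combined into two simple predicates. This is a rough sketch under stated assumptions: the mode names and function names are hypothetical, the two-cycle RdRdy#/WrRdy# timing and the timing-mode requirements are collapsed into a single boolean argument, and the out-of-order case assumes that a write is not blocked by pending reads, which is how the last sentence above is read here.

```python
# Maximum number of read requests that can be pending at once.
MAX_PENDING_READS = {"normal": 1, "out_of_order": 5}


def may_issue_read(mode, pending_reads, rdrdy_ready):
    """True if a new processor read request may be issued.

    rdrdy_ready: RdRdy# was active at least two cycles before the address cycle.
    """
    return rdrdy_ready and pending_reads < MAX_PENDING_READS[mode]


def may_issue_write(mode, pending_reads, wrrdy_ready):
    """True if a new processor write request may be issued.

    wrrdy_ready: WrRdy# met the two-cycle and timing-mode requirements.
    """
    if mode == "normal" and pending_reads > 0:
        return False   # normal mode: no write while a read is pending
    return wrrdy_ready
```

In the normal mode a single pending read blocks both further reads and writes, whereas in the out-of-order return mode reads are blocked only once five are pending.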
12.6 External requests

External requests include write requests and null requests.

(1) Outline of request
A write request supplies data to be written to the internal resources (interrupt register) of the processor. A null request returns the system interface to the master status without affecting the processor.

(2) Controlling requests
As shown in Figure 12-8, the processor controls the flow of external requests via the arbitration signals ExtRqst# and Release#. The external agent cannot issue an external request unless it is granted the right to control the system interface. The external agent acquires the right to control the system interface by asserting the ExtRqst# signal and waiting until the processor asserts the Release# signal for the duration of one cycle. When the external agent issues an external request, the right to control the system interface is returned to the processor.

Figure 12-8. Flow of External Request
[Figure: between the VR5500 and the external agent. <1> The external system requests the right of control by asserting the ExtRqst# signal. <2> The processor grants the right of control by asserting the Release# signal. <3> The external system issues an external request. <4> The right of control is returned to the processor.]

The right to control the system interface is always returned to the processor when the ValidIn# signal has been asserted after an external request was issued. The processor does not acknowledge subsequent external requests until it completes the current request.

(3) Issuing request
If there is no pending processor request, the processor determines whether it receives an external request or issues a new processor request, depending on its internal status. The processor can issue a new processor request even while the external agent is requesting access to the system interface.

The external agent asserts the ExtRqst# signal to indicate that it wants to start an external request. In response, the processor asserts the Release# signal to release the right to control the system interface. The processor can acknowledge an external request in the following cases.
• When the processor has completed the processor request under execution
• When the ExtRqst# signal is input to the processor one or more cycles before the RdRdy#/WrRdy# signal is asserted, while the processor is waiting for assertion of the RdRdy#/WrRdy# signal to issue a processor read/write request
• When the processor puts the system interface in the uncompelled slave status and waits for a response to a read request (the external agent can issue an external request before supplying the read response data)
12.6.1 External write request

When the external agent issues a write request, the specified resource is accessed and data is written to it. The external write request is completed when word data has been transferred to the processor.

The only resource of the processor that can be accessed by an external write request is the interrupt register.

12.6.2 Read response

A read response is used by the external agent to return data in response to a processor read request. Unlike the other external requests, a read response does not execute system interface arbitration (requesting the right to control the system interface by using the ExtRqst# signal). Therefore, a read response is treated as something different from an external request.

The data identifier of response data can also indicate that the response data contains an error, so that the processor can generate a bus error exception.

Figure 12-9. Read Response
[Figure: <1> the VR5500 sends a read request to the external agent. <2> The external agent returns a read response to the VR5500.]
12.7 event processing

this section explains the following system events.

- load miss
- store miss
- store hit
- load/store in uncached area
- accelerated store in uncached area
- instruction fetch from uncached area
- fetch miss

12.7.1 load miss

if the processor misses the data cache when loading data, it issues a read request to obtain a cache line. the external agent returns data as a read response. if the cache data to be replaced is dirty, the processor writes back this data to memory. after writing back the data, the processor requests clean data from the external agent and writes it to the cache. table 12-2 shows the operation when a load miss occurs.

table 12-2. operation in case of load miss

page attribute | status of data cache line to be replaced
               | clean/invalid | dirty
---------------+---------------+------
cache          | br            | br/bw

br: processor block read request
bw: processor block write request

if it is necessary to write back the current cache line, the processor issues a block write request to save the dirty cache line to memory.
12.7.2 store miss

if a processor store miss occurs in the cache, the processor requests the cache line that holds the target store location from the external agent. table 12-3 shows the operation in case of a store miss.

table 12-3. operation in case of store miss

page attribute | status of data cache line to be replaced
               | clean/invalid | dirty
---------------+---------------+------
writeback      | br            | br/bw
write through  | br/w          | -

br: processor block read request
bw: processor block write request
w: processor non-block write request

the processor issues a block read request for the cache line that holds the data element to be loaded, and waits until the external agent supplies read data in response to this read request. if it is necessary to write back the current cache line, the processor issues a request to write the current cache line. if the page attribute is write through, the processor issues a non-block write request.

12.7.3 store hit

the operation on the system bus is determined by whether the cache line in question is writeback or write through. if the line uses the writeback policy, a store hit does not generate a processor request. if the line uses the write-through policy, a store hit generates a non-block write request for the store data.

12.7.4 load/store in uncached area

when the processor executes loading from an uncached area, it issues a read request for a doubleword, an unaligned doubleword, a word, or an unaligned word. if the processor executes storing in an uncached area, it issues a write request for a doubleword, an unaligned doubleword, a word, or an unaligned word.

all the write requests by the processor are buffered in a 4-stage write transaction buffer, and output to the system interface. because this buffer is a fifo, if the buffer has an entry when a read request is issued, processing of the read request is started after the buffer has become completely empty.
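Tables 12-2 and 12-3 can be read as a small lookup from the miss type, the page attribute, and the state of the line being replaced to the processor requests that appear on the system interface. The following Python sketch only restates those two tables. The function names and the string codes are illustrative and are not part of the vr5500 documentation.

```python
# Request codes as used in tables 12-2 and 12-3:
#   "br" = processor block read, "bw" = processor block write,
#   "w"  = processor non-block write.

def load_miss_requests(victim_dirty):
    """Requests generated by a load miss (table 12-2)."""
    return ["br", "bw"] if victim_dirty else ["br"]


def store_miss_requests(page_attribute, victim_dirty):
    """Requests generated by a store miss (table 12-3).

    page_attribute is "writeback" or "write through" (hypothetical
    spellings chosen for this sketch).
    """
    if page_attribute == "writeback":
        return ["br", "bw"] if victim_dirty else ["br"]
    if page_attribute == "write through":
        if victim_dirty:
            # Table 12-3 has no entry for this combination.
            raise ValueError("no entry for a dirty line with write through")
        return ["br", "w"]
    raise ValueError("unknown page attribute: %r" % (page_attribute,))
```

For example, `store_miss_requests("write through", False)` yields a block read followed by a non-block write, matching the "br/w" entry.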
12.7.5 accelerated store in uncached area an accelerated operation to an uncached area is used to access a page with an uncached accelerated cache algorithm. when the processor executes an accelerated store operation to an uncached area, it can issue a block write request or a write request for one or more doublewords, an unaligned doubleword, a word, or an unaligned word. all the write requests by the processor are buffered in a 4-stage write transaction buffer and output to the system interface. because this buffer is a fifo, if the buffer has an entry when a read request is issued, processing of the read request is started after the buffer has become completely empty. by an accelerated operation to an uncached area, several sequential uncached word/doubleword accesses can be combined into one 32-byte block write operation that can be processed by one external sysad bus transaction. when organizing a system, utmost care must be exercised in locating data that is used to access an uncached accelerated page, so that this transaction is effectively performed. an accelerated write operation to an uncached area is buffered in the write transaction buffer on a fifo basis, in the same way as the other transactions. if the data used for an accelerated write operation on an uncached area is
located in accordance with the following rules, however, two or more consecutive transactions are combined on a fifo basis and processed as a 4-doubleword access.

- if the first target of the accelerated operation to the uncached area is located at a 32-byte boundary
- if all the accelerated operations to the uncached area to be processed are word or doubleword accesses
- if the target of the word or doubleword access to be processed is located at a word boundary or doubleword boundary
- in the case of word access, if the targets are located consecutively at a doubleword boundary
- if the address value is incremented sequentially

a write transaction to an uncached area that is not in compliance with these rules is not treated as an accelerated operation. if the transactions for an accelerated operation include a transaction that does not comply with the above rules, all the transactions are processed as ordinary uncached word/doubleword accesses.

an accelerated operation to an uncached area is aborted when the processor enters the debug mode. in the debug mode, the contents of the write transaction buffer are cleared. if an exception occurs, the accelerated operation to the uncached area is also aborted.

12.7.6 instruction fetch from uncached area

the processor issues a word read to fetch an instruction in an uncached area. therefore, the system rom address space that is accessed while the processor is booting must support an aligned 32-bit read operation.

12.7.7 fetch miss

if a miss occurs in the instruction cache while an instruction is being fetched, the processor issues a read request to obtain a cache line. the external agent returns data as a read response.
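The placement rules for accelerated stores in 12.7.5 can be checked mechanically when laying out data for an uncached accelerated page. The sketch below is one reading of those rules, not a model of the hardware. It assumes that a combined block covers exactly 32 bytes (the "4-doubleword access"), and it represents each store as a hypothetical `(address, size_in_bytes)` pair. With natural alignment and strictly sequential addresses enforced, word stores necessarily come in pairs that fill a doubleword.

```python
BLOCK_BYTES = 32  # assumed size of one combined block write (4 doublewords)


def combines_into_block_write(accesses):
    """Return True if the stores satisfy the placement rules of 12.7.5.

    accesses: list of (address, size_in_bytes) tuples in program order.
    """
    if not accesses:
        return False
    first_addr = accesses[0][0]
    if first_addr % BLOCK_BYTES != 0:      # first target on a 32-byte boundary
        return False
    expected = first_addr
    for addr, size in accesses:
        if size not in (4, 8):             # word or doubleword accesses only
            return False
        if addr % size != 0:               # word / doubleword aligned
            return False
        if addr != expected:               # addresses increase sequentially
            return False
        expected = addr + size
    return expected - first_addr == BLOCK_BYTES


# Eight sequential word stores starting on a 32-byte boundary qualify.
words = [(0x1000 + 4 * i, 4) for i in range(8)]
print(combines_into_block_write(words))
```

A sequence that starts off the 32-byte boundary, skips an address, or mixes in a misaligned doubleword fails the check and, per the text above, would be processed as ordinary uncached accesses.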
12.8 error check function

12.8.1 parity error check

the vr5500 performs error detection only, using even parity.

parity error detection is the simplest error detection method. by appending 1 bit, called a parity bit, to the end of data, an error of 1 bit can be detected. however, the error cannot be corrected.

parity comes in the following two types.

- odd parity appends a bit of 1 to data when the number of 1s in the data is even, making the total number of 1s, including the parity bit, odd.
- even parity appends a bit of 1 to data when the number of 1s in the data is odd, making the total number of 1s, including the parity bit, even.

here is an example of odd parity and even parity.

data(3:0) | odd parity bit | even parity bit
----------+----------------+----------------
0010      | 0              | 1

in this example, only one bit of data(3:0), data1, is 1.

- even parity sets the parity bit to 1. as a result, the number of bits that are 1 is two (even).
- odd parity sets the parity bit to 0. as a result, the number of bits that are 1 remains odd (only the one bit of data1).

here is an example of odd parity and even parity for various data values.

data(3:0) | odd parity bit | even parity bit
----------+----------------+----------------
0110      | 1              | 0
0000      | 1              | 0
1111      | 1              | 0
1101      | 0              | 1

parity can detect an error of 1 bit but cannot identify the bit that has the error. for example, if a value 00011 is received as odd parity, this data has an error because the last bit is the parity bit and the number of 1s, which should be odd, is even. however, which bit has the error is unknown.
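The parity rules above reduce to counting the 1 bits. The following Python sketch reproduces the 4-bit examples from this section. The function names and the `width` parameter are illustrative only, and the sketch says nothing about how the vr5500 groups data bits under each check bit.

```python
def parity_bit(data, width=4, even=True):
    """Return the parity bit to append to the low `width` bits of data."""
    ones = bin(data & ((1 << width) - 1)).count("1")
    odd_count = ones % 2                 # 1 if the data holds an odd number of 1s
    return odd_count if even else 1 - odd_count


def has_parity_error(value, width=4, even=True):
    """Check `width` data bits followed by one parity bit (the last bit)."""
    total_ones = bin(value & ((1 << (width + 1)) - 1)).count("1")
    if even:
        return total_ones % 2 == 1       # even parity: total must be even
    return total_ones % 2 == 0           # odd parity: total must be odd


print(parity_bit(0b0010, even=True), parity_bit(0b0010, even=False))
print(has_parity_error(0b00011, even=False))
```

The last call corresponds to the 00011 example: under odd parity the total number of 1s is even, so an error is reported, but the function cannot tell which bit is wrong.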
12.8.2 error check operation

the processor uses parity to check the accuracy of data when it transfers data between the system interface and cache.

(1) system interface bus
the processor generates an accurate check bit for the data of a word or an unaligned word that is to be transferred to the system interface. it does not change the data check bit of the cache and directly passes it to the system interface because only the accuracy of the data is to be checked.

the processor does not check the data of an external write operation it receives from the system interface. the processor can also be set to not check the data of a read response it receives from the system interface by setting the syscmd4 bit of a data identifier.

the processor does not check an address it has received from the system interface, and does not generate a check bit for the address to be transferred to the system interface.

the vr5500 does not have a circuit that corrects data. if an error is detected in accordance with the data check bit, a cache error exception occurs. perform error processing by software.

(2) system interface command bus
the vr5500 does not have a function to check the data of the system interface command bus.
(3) outline of error check operation
tables 12-4 and 12-5 outline the error check operation.

table 12-4. error check for internal transaction

bus | uncached load | uncached store | cache load from system interface | system interface write from cache | cache instruction
----+---------------+----------------+----------------------------------+-----------------------------------+------------------
processor data | from system | not checked | not changed, from system interface | checked, and trap occurs in case of error | checked when cache is written back, and trap occurs in case of error
system address, command, check bit during transfer | not generated | not generated | not generated | not generated | not generated
system address, command, check bit during reception | not checked | not checked | not checked | not checked | not checked
system interface data | checked, and trap occurs in case of error | from processor | specified word is checked, and trap occurs in case of error | from cache | from cache
system interface data check bit | checked, and trap occurs in case of error | generated | specified word is checked, and trap occurs in case of error | from cache | from cache

table 12-5. error check for external transaction

bus | external write
----+---------------
processor data | disabled
system address, command, check bit during transfer | disabled
system address, command, check bit during reception | not checked
system interface data | not checked
system interface data check bit | not checked
chapter 13 system interface (64-bit bus mode)

this chapter explains the request protocol of the system interface in the 64-bit bus normal mode.

the system interface of the vr5500 can be set in the 64-bit bus mode by inputting a high level to the busmode pin before a power-on reset. it can also be set in the normal mode by inputting a high level to the o3return# pin before a power-on reset, and in the out-of-order return mode by inputting a low level to the same pin.

the 64-bit bus normal mode is also called the r5000 mode, in which the vr5500 is compatible with the bus protocol of the vr5000 series. to set this mode, input a high level to the dwbtrans# and disdvalido# pins before a power-on reset.

vr5500 bus mode
- busmode = h: 64-bit bus mode
  - o3return# = l: out-of-order return mode
  - o3return# = h, dwbtrans# = h, disdvalido# = h: r5000 mode
- busmode = l: 32-bit bus mode
  - o3return# = l: out-of-order return mode
  - o3return# = h, dwbtrans# = h, disdvalido# = h: r5000 mode (compatible with rm523x)
  - o3return# = h, dwbtrans# = l, disdvalido# = l: vr5432 native mode

for the protocol in the 32-bit bus normal modes (operation mode compatible with native mode of the vr5432 and the rm523x), refer to chapter 14 system interface (32-bit bus mode). for the protocol in the out-of-order return mode, refer to chapter 15 system interface (out-of-order return mode).
13.1 protocol of processor requests

this section explains the following two processor request protocols.

- read
- write

13.1.1 processor read request protocol

the following sequence explains the protocol of a processor read request for a doubleword, unaligned doubleword, word, and unaligned word (the numbers correspond to the numbers in figure 13-1).

<1> the external agent drives the rdrdy# signal low to indicate that it is ready to acknowledge a read request.
<2> when the system interface is in the master status, the processor issues a processor read request by driving a read command onto the syscmd bus and a read address onto the sysad bus. a physical address is driven onto sysad(35:0). all the other bits are driven to 0.
<3> at the same time, the processor asserts the validout# signal for 1 cycle. this signal indicates that valid data is on the syscmd and sysad buses.
<4> the processor puts the system interface in the uncompelled slave status. the external agent must not assert the extrqst# signal in order to return a read response; it must instead wait until the transition of the system interface to the uncompelled slave status is completed.
<5> the processor releases the syscmd and sysad buses 1 cycle after the release# signal has been asserted.
<6> the external agent drives the syscmd and sysad buses 2 cycles after the release# signal has been asserted.

when the system interface has been put in the slave status, the external agent can return the requested data by using a read response. instead of the requested data, the read response can return an indication that an error has occurred if the requested data could not be retrieved correctly. if the returned data contains an error, the processor generates a bus error exception.
figure 13-1 shows the processor read request, and uncompelled transition to the slave status that takes place when the read request is issued. the timing of the sysadc bus is the same as that of the sysad bus.
figure 13-1. processor read request
[timing diagram over syscycles 1 to 12; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), rdrdy# (input), validout# (output), release# (output); an addr/read cycle is followed by the transition from master to slave; callouts <1> to <6>]

remark the dotted line indicates high impedance.

after the release# signal has been asserted (<6> and later in the figure), the processor can acknowledge both a read response (if the read request is pending) and an external request.

13.1.2 processor write request protocol

the processor write request is issued by using either of the following two protocols.

- a write request for a doubleword, word, or unaligned word uses a single write request protocol.
- cache block write and uncached accelerated write use a block write request protocol.

a processor write request is issued when the system interface is in the master status. figure 13-2 shows the processor single write request cycle and figure 13-3 shows the processor block write request cycle (the numbers in the explanation below correspond to the numbers in the figures).

<1> the external agent drives the wrrdy# signal low to indicate that it is ready to acknowledge a write request.
<2> the processor issues a processor write request by driving a write command onto the syscmd bus and a write address onto the sysad bus. a physical address is driven onto sysad(35:0). all the other bits are driven to 0.
<3> the processor asserts the validout# signal.
<4> the processor drives a data identifier onto the syscmd bus and data onto the sysad bus.
<5> the data identifier corresponding to the data cycle must include an indication of the last data cycle. at the end of the cycle, the validout# signal is deasserted.

remark the timing of the sysadc bus is the same as that of the sysad bus.
figure 13-2. processor non-block write request protocol
[timing diagram over syscycles 1 to 12; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input); an addr/write cycle is followed by one data0/neod cycle while the system interface stays in the master status; callouts <1> to <5>]

figure 13-3. processor block write request
[timing diagram over syscycles 1 to 12; same signals as figure 13-2; an addr/write cycle is followed by data0 to data3, with ndata on syscmd for data0 to data2 and neod for data3; callouts <1> to <5>]
13.1.3 control of processor request flow

the external agent uses the rdrdy# signal to control the flow of processor read requests. figure 13-4 shows the control of the read request flow (the numbers in the explanation below correspond to the numbers in the figure).

<1> the processor samples the rdrdy# signal and determines whether the external agent can acknowledge a read request.
<2> the processor issues a read request to the external agent.
<3> the external agent deasserts the rdrdy# signal. this indicates that no more read requests can be acknowledged.
<4> because the rdrdy# signal was deasserted two cycles earlier, issuance of the read request is stalled.
<5> the read request is issued again to the external agent.

figure 13-4. control of processor request flow
[timing diagram over syscycles 1 to 13; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), rdrdy# (input), release# (output); two addr/read cycles, each followed by a master-to-slave transition and a data/neod response, with unused (unsd) cycles while the second request is stalled; callouts <1> to <5>]

remark the dotted line indicates high impedance.
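The stall in step <4> follows from a two-cycle sampling relationship: whether a read request can go out in a given cycle depends on the level of rdrdy# two cycles earlier. The following Python sketch illustrates only that relationship. The list-of-levels representation and the function names are assumptions made for the example (0 stands for asserted, because rdrdy# is active low), and other conditions for issuing a request, such as the system interface being in the master status, are not modelled.

```python
def can_issue_read(rdrdy_levels, address_cycle):
    """True if rdrdy# was asserted (0) two cycles before address_cycle.

    rdrdy_levels[i] is the level of rdrdy# sampled in cycle i.
    """
    sample_cycle = address_cycle - 2
    return sample_cycle >= 0 and rdrdy_levels[sample_cycle] == 0


def first_issue_cycle(rdrdy_levels, earliest_cycle):
    """First cycle at or after earliest_cycle in which a read may be issued."""
    for cycle in range(earliest_cycle, len(rdrdy_levels) + 2):
        if can_issue_read(rdrdy_levels, cycle):
            return cycle
    return None


# rdrdy# is deasserted (1) in cycles 2 and 3, so a request that would
# otherwise start in cycle 4 is stalled until cycle 6.
levels = [0, 0, 1, 1, 0, 0]
print(first_issue_cycle(levels, 4))
```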
figure 13-5 shows an example in which two processor write requests are issued but issuance of the second request is delayed because of the condition of the wrrdy# signal (the numbers in the explanation below correspond to the numbers in the figure).

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write request.
<2> the processor asserts the validout# signal, and drives a write command onto the syscmd bus and a write address onto the sysad bus.
<3> the second write request is delayed until the wrrdy# signal is asserted again.
<4> if the wrrdy# signal was active two cycles earlier, an address cycle is issued in response to the processor write request. this completes the issuance of the write request.

remark the timing of the sysadc bus is the same as that of the sysad bus.

figure 13-5. timing when second processor write request is delayed
[timing diagram over syscycles 1 to 12; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input); two addr/write cycles, each followed by a data/neod cycle, in the master status; callouts <1> to <4>]

13.1.4 timing mode of processor request

the vr5500 has three timing modes: vr4000-compatible mode, write re-issuance mode, and pipeline write mode.

- vr4000-compatible mode
  if single write requests are successively issued, the processor inserts two unused cycles after the data cycle so that an address cycle is issued once every 4 system cycles.
- write re-issuance mode
  if the wrrdy# signal is deasserted in the address cycle of a write request, that request is discarded, but the processor issues the same write request again.
- pipeline write mode
  even if the wrrdy# signal is deasserted in the address cycle of a write request, the processor assumes that it has issued that request.
(1) vr4000-compatible mode
with the vr5500 processor interface, the wrrdy# signal must be asserted two system clocks before issuance of a write cycle. if the wrrdy# signal is deasserted immediately after the external agent has received a write request that fills the buffer, the subsequent write requests are kept waiting for 4 system cycles. the processor inserts at least two unused system cycles after a write address/data pair, giving the external agent time to stall the next write request.

figure 13-6 shows a back-to-back write cycle in the vr4000-compatible mode (the numbers in the explanation below correspond to the numbers in the figure).

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write request.
<2> the wrrdy# signal remains active. this indicates that the external agent can acknowledge another write request.
<3> the wrrdy# signal is deasserted. this indicates that the external agent cannot acknowledge any more write requests, and that issuance of the next write request is stalled.

figure 13-6. timing of vr4000-compatible back-to-back write cycle
[timing diagram over syscycles 1 to 14; signals: sysclock (input), sysad(63:0) (i/o), validout# (output), wrrdy# (input); write#1, write#2, and write#3 each consist of an addr cycle and a data cycle, separated by two unused (unsd) cycles so that each write occupies 4 cycles; callouts <1> to <3>]
(2) write re-issuance mode
figure 13-7 shows the write re-issuance protocol (the numbers in the explanation below correspond to the numbers in the figure).

a write request is issued when the wrrdy# signal is asserted both two cycles before the address cycle and in the address cycle.

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write request.
<2> the wrrdy# signal remains active even when the write request has been issued. this indicates that the external agent can acknowledge another write request.
<3> the wrrdy# signal is deasserted in the address cycle. this write cycle is aborted.
<4> the external agent asserts the wrrdy# signal, indicating that it is ready to acknowledge a write request. in response, the write request aborted in <3> is re-issued.
<5> even if a write request is issued, the wrrdy# signal remains active. this indicates that the external agent can acknowledge another write request.

figure 13-7. write re-issuance
[timing diagram over syscycles 1 to 14; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input); addr0/data0 is issued, the first addr1/data1 is not issued because wrrdy# is deasserted in its address cycle, and addr1/data1 is then re-issued; callouts <1> to <5>]
(3) pipeline write mode
figure 13-8 shows the pipeline write protocol (the numbers in the explanation below correspond to the numbers in the figure).

if the wrrdy# signal is asserted two cycles before the address cycle, a write request is issued. after the wrrdy# signal has been deasserted, the external agent must acknowledge one more write request.

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write request.
<2> even when the write request has been issued, the wrrdy# signal remains active. this indicates that the external agent can acknowledge one more write request.
<3> the wrrdy# signal is deasserted. this indicates that the external agent can acknowledge no more write requests. however, this write request is acknowledged.
<4> the external agent asserts the wrrdy# signal, indicating that it can acknowledge a write request.

figure 13-8. pipeline write
[timing diagram over syscycles 1 to 14; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input); addr0/data0 and addr1/data1 are issued, unused (unsd) cycles follow while wrrdy# is deasserted, and addr2/data2 is issued after wrrdy# is asserted again; callouts <1> to <4>]
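The three timing modes in 13.1.4 differ only in how they treat the level of wrrdy# in the address cycle. The Python sketch below summarises that difference for a single write request. It is a simplified reading of the text above: the mode names, the function name, and the result strings are invented for the example, and the unused cycles that the vr4000-compatible mode inserts between writes are not modelled.

```python
def write_outcome(mode, wrrdy_two_cycles_before, wrrdy_in_address_cycle):
    """Classify one write request.

    mode: "vr4000", "reissue", or "pipeline" (hypothetical labels).
    Both wrrdy arguments are True when wrrdy# is asserted (driven low).
    """
    if mode not in ("vr4000", "reissue", "pipeline"):
        raise ValueError("unknown timing mode: %r" % (mode,))
    if not wrrdy_two_cycles_before:
        # No mode starts an address cycle unless wrrdy# was asserted
        # two cycles earlier.
        return "not issued"
    if mode == "reissue" and not wrrdy_in_address_cycle:
        # Write re-issuance mode: the request is discarded and the
        # processor issues the same request again later.
        return "discarded, re-issued later"
    # Pipeline write mode treats the request as issued even if wrrdy#
    # is deasserted in the address cycle; the external agent must accept it.
    return "issued"


for mode in ("vr4000", "reissue", "pipeline"):
    print(mode, write_outcome(mode, True, False))
```

In this model, only the write re-issuance mode lets the external agent refuse a request by deasserting wrrdy# in the address cycle. This is why, in pipeline write mode, the external agent needs room for one more write after it deasserts wrrdy#.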
13.2 protocol of external request

an external request can be issued only when the system interface is in the slave status. arbitration that changes the status of the system interface from master to slave is realized by using the handshake signals of the system interface (extrqst# and release#).

this section explains the following external request protocols, as well as the arbitration protocol.

- null
- write
- read response

13.2.1 external arbitration protocol

to issue an external request, assert the extrqst# signal to arbitrate for the system interface. then wait until the processor asserts the release# signal and releases the system interface to the slave status. when the system interface is already in the slave status, i.e., when the processor previously executed an uncompelled transition of the system interface to the slave status, the external agent can immediately start issuing an external request.

after issuing an external request, the external agent must return the right to control the system interface to the processor. if the external agent does not have any more external requests that must be processed, it must deassert the extrqst# signal two cycles after the release# signal was asserted. to issue two or more requests in a row, the extrqst# signal must be kept active until the last request cycle. if the last request cycle lasts for two cycles or more after the release# signal was asserted, deassert the extrqst# signal.

while the extrqst# signal is active, the processor continues processing the external request. however, the processor cannot release the system interface to process the next external request until processing of the current request is finished. while the extrqst# signal is active, two or more successive external requests cannot be interrupted by a processor request.
figure 13-9 shows the arbitration protocol of an external request issued by the external agent. the following sequence explains the arbitration protocol (the numbers in the explanation below correspond to the numbers in the figure).

<1> the external agent continues asserting the extrqst# signal to issue an external request.
<2> the processor asserts the release# signal for 1 cycle when it is ready to process the external request.
<3> the processor makes the sysad and syscmd buses go into a high-impedance state.
<4> the external agent must drive the sysad and syscmd buses no earlier than two cycles after the release# signal was asserted.
<5> the external agent must deassert the extrqst# signal two cycles after the release# signal was asserted, except when it executes another external request.
<6> the external agent must make the sysad and syscmd buses go into a high-impedance state on completion of the external request.

remark the timing of the sysadc bus is the same as that of the sysad bus.

figure 13-9. external request arbitration protocol
[timing diagram over syscycles 1 to 12; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validin# (input), extrqst# (input), release# (output); the system interface goes from master to slave, the external agent drives addr/cmd and data0/neod, and the interface returns to master; callouts <1> to <6>]

remark the dotted line indicates high impedance.
13.2.2 external null request protocol

the processor supports an external null request. this request only returns the system interface from the slave status to the master status, and does not have any other influence on the processor.

figure 13-10 shows the timing of the external null request (the numbers in the explanation below correspond to the numbers in the figure).

<1> the external agent drives an external null request command onto the syscmd bus and asserts the validin# signal for one cycle. this returns the right to control the system interface to the processor.
<2> the sysad bus is not used in the address cycle corresponding to the external null request (the bus does not hold valid data).
<3> when the address cycle is issued, the null request is completed.

the external null request returns the system interface to the master status when the external agent has released the syscmd and sysad buses.

figure 13-10. external null request protocol
[timing diagram over syscycles 1 to 12; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), extrqst# (input), release# (output); in the slave status the external agent drives sinull on syscmd while sysad is unused (unsd), after which the interface returns to master; validout#, extrqst#, and release# stay high; callouts <1> to <3>]

remark the dotted line indicates high impedance.
13.2.3 external write request protocol

the external write request performs an operation similar to the processor single write request, except that it asserts the validin# signal instead of the validout# signal.

figure 13-11 shows the timing of the external write request (the numbers in the explanation below correspond to the numbers in the figure).

<1> the external agent asserts the extrqst# signal to arbitrate for the system interface.
<2> the processor asserts the release# signal to release the system interface to the slave status.
<3> the external agent asserts the validin# signal and drives a write command onto the syscmd bus and a write address onto the sysad bus.
<4> the external agent asserts the validin# signal and drives a data identifier onto the syscmd bus and data onto the sysad bus.
<5> the data identifier corresponding to the data cycle must contain an indication of the last data cycle.
<6> when the data cycle is issued, the write request is completed. the external agent makes the syscmd and sysad buses go into a high-impedance state, and returns the system interface to the master status.

remark the timing of the sysadc bus is the same as that of the sysad bus.

the external write request can only write word data to the processor. if a data element other than a word is specified for the external write request, the operation of the processor is undefined.

figure 13-11. external write request protocol
[timing diagram over syscycles 1 to 12; signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), extrqst# (input), release# (output); the interface goes from master to slave, the external agent drives addr/write and data0/neod, and the interface returns to master; validout# stays high; callouts <1> to <6>]

remark the dotted line indicates high impedance.
13.2.4 read response protocol

the external agent must return data to the processor by using a read response protocol, in response to a processor read request. the following sequence explains the read response protocol (the numbers in the explanation below correspond to the numbers in figures 13-12 and 13-13).

<1> the external agent waits until the processor puts the system interface in the uncompelled slave status.
<2> the external agent returns data via a single data cycle or a series of data cycles.
<3> when the last data cycle is issued, the read response is completed, and the external agent makes the syscmd and sysad buses go into a high-impedance state.
<4> the system interface returns to the master status.

remark when the read request is issued, the processor always puts the system interface in the uncompelled slave status.

<5> the data identifier of the data cycle must indicate that this data is response data.
<6> the data identifier corresponding to the last data cycle must contain an indication of the last data cycle.

if the read response is for a block read request, the response data does not have to identify the initial cache status. the processor automatically sets the cache line to the clean status.

the data identifier corresponding to the data cycle can indicate that the data transferred in that cycle has an error. even if the data has an error, however, the external agent must return a data block of the correct size. the processor checks the error bit of only the first doubleword of the block, and ignores the rest of the error bits of that block (refer to 13.2.5 sysadc(7:0) protocol for block read response).

read response data is passed to the processor only when there is a pending processor read request. the operation of the processor is undefined if there is no pending processor read request when a read response is received.
figure 13-12 shows a processor word request and the word read response that follows. figure 13-13 shows the read response to a processor block read request when the system interface is already in the slave status. remark the timing of the sysadc bus is the same as that of the sysad bus.
figure 13-12. protocol of read request and read response
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validin# (input), extrqst# (input), release# (output), validout# (output). the sysad bus carries addr and then data0; the syscmd bus carries read and then neod; the system interface goes from master to slave and back to master. callouts <1> to <4> and <6> mark the steps of 13.2.4.]
remark the dotted line indicates high impedance.

figure 13-13. block read response in slave status
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validin# (input), validout# (output), release# (output), extrqst# (input). the sysad bus carries data0 to data3; the syscmd bus carries ndata for the first three data cycles and neod for the last. callouts <2> to <6> mark the steps of 13.2.4.]
remark the dotted line indicates high impedance.
13.2.5 sysadc(7:0) protocol for block read response

when a block read response is issued, sysadc(7:0) must be used in compliance with the following rules.

- only the first doubleword of transfer data is checked. if the data has an error (syscmd5 = 1), the cache line is invalidated, and a bus error exception occurs in the processor.
- a parity error of the first doubleword is detected when a request is issued, and a cache error exception occurs. at this time, the cache line is in the invalid status. a parity error of a subsequent doubleword is detected again when that data is used.
- the error bits in the three subsequent doublewords of data are ignored. the parity of each doubleword is written to the cache, but is not checked until the data is referenced.
- if a memory error occurs during a block read operation, the sysadc bit must be changed to an illegal parity during a read response operation for all the bytes that are affected by the memory error. however, even if syscmd5 is set to 1 during data transfer other than the first doubleword, a bus error exception does not occur. if the sysadc bit has been changed to an illegal parity, a cache error exception occurs when any of the remaining three doublewords is referenced.

13.3 data flow control

the system interface supports a data rate of 1 doubleword per cycle.

13.3.1 data rate control

the external agent can send data to the processor at the maximum data rate of the system interface. the rate at which data is sent to the processor is controlled on the external agent side; the processor places no limit on the transfer rate from the external agent. the external agent asserts the validin# signal in the cycle in which it transfers data. as long as the validin# signal is asserted and a data identifier is on the syscmd bus, the processor acknowledges the cycle as valid.
it then goes on acknowledging data until it receives a data word with neod. the operation of the processor is undefined if data is sent in a pattern of other than 1 cycle for single data, and other than 4 cycles for block data. figure 13-14 shows the timing of the read response where the data rate pattern is ddx.
figure 13-14. read response with data rate pattern ddx
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(63:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), release# (output), extrqst# (input). the sysad bus carries data0 to data3 in the ddx pattern; the syscmd bus carries a data identifier (ndata) with each data cycle.]
remark the dotted line indicates high impedance.

13.3.2 block write data transfer pattern

the rate at which the processor transfers block write data to the external agent can be set by the ep bit of the config register after reset. the data pattern is written with the characters d and x, which give the sequence of data cycles and unused cycles at each data rate. d indicates a data cycle, and x indicates an unused cycle. for example, the dxx data pattern indicates a data rate of 1 doubleword in every 3 cycles. table 13-1 shows the maximum data rate that can be set after reset.

table 13-1. transfer data rate and data pattern

maximum data rate          data pattern
1 doubleword/1 cycle       dddd
2 doublewords/3 cycles     ddxddx
2 doublewords/4 cycles     ddxxddxx
1 doubleword/2 cycles      dxdxdxdx
2 doublewords/5 cycles     ddxxxddxxx
2 doublewords/6 cycles     ddxxxxddxxxx
1 doubleword/3 cycles      dxxdxxdxxdxx
2 doublewords/8 cycles     ddxxxxxxddxxxxxx
1 doubleword/4 cycles      dxxxdxxxdxxxdxxx

13.3.3 system endianness

the endianness of the system is set by the bigendian pin after reset. the set endianness is indicated by the be bit of the config register.
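the d/x notation of 13.3.2 can be checked with a few lines of code. the following is a minimal sketch, not part of the manual: the function names data_rate and cycles_for_block are hypothetical, and the default of 4 doublewords per block is an assumption taken from the 8-word block size used in this chapter.

```python
from fractions import Fraction


def data_rate(pattern: str) -> Fraction:
    """return doublewords per cycle for a pattern of 'd' (data) and 'x' (unused)."""
    if not pattern or set(pattern) - {"d", "x"}:
        raise ValueError("pattern must consist of 'd' and 'x' only")
    return Fraction(pattern.count("d"), len(pattern))


def cycles_for_block(pattern: str, doublewords: int = 4) -> int:
    """count cycles up to and including the last data cycle of one block.

    the pattern is repeated as often as needed; 4 doublewords per block
    is an assumption (8-word block on a 64-bit bus).
    """
    if "d" not in pattern or set(pattern) - {"d", "x"}:
        raise ValueError("pattern needs at least one 'd' and only 'd'/'x'")
    cycles = 0
    sent = 0
    while sent < doublewords:
        if pattern[cycles % len(pattern)] == "d":
            sent += 1
        cycles += 1
    return cycles
```

for example, data_rate("dxx") gives 1/3, which matches the "1 doubleword/3 cycles" row of table 13-1, and data_rate("ddxddx") gives 2/3.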
13.4 independent transfer with sysad bus

for general applications, the sysad bus is a point-to-point connection between the processor and a bidirectional register type transceiver in the external agent. for such applications, only the processor and external agent can be connected to the sysad bus.

for specific applications, other drivers and receivers are connected to the sysad bus so that transfer can be performed on the sysad bus independently of the processor. this is called independent transfer. to execute independent transfer, the external agent must arbitrate for the right to control the sysad bus by using the arbitration handshake signals and an external null request.

the procedure of independent transfer of the sysad bus is as follows.

<1> the external agent requests the right to control the sysad bus by asserting the extrqst# signal to issue an external request.
<2> the processor releases the system interface to the slave status by asserting the release# signal.
<3> in this way, the external agent can execute independent transfer on the sysad bus. the validin# signal must not be asserted during transfer.
<4> when transfer is completed, the external agent returns the system interface to the master status by issuing an external null request.

13.5 system interface cycle time

because processor requests are restricted by the system interface protocol, the number of request cycles can be determined from the protocol. because external requests have the following two types of wait times, the number of request cycles differs depending on these wait times.

- standby time until the processor releases the system interface to the slave status in response to an external request (release wait time)
- response time of the external request that requires a response (external response wait time)

while an external request is being issued, the release wait time differs depending on the status of the system interface. when the external request is detected, the system interface is released to the external agent after the cycle under processing. the external response time of the vr5500 is kept to the minimum. data that is written is immediately loaded.
13.6 system interface commands and data identifiers

a system interface command defines the type and attribute of a system interface request. this definition is indicated in the address cycle of a request. the system interface data identifier defines the attribute of the data transferred in the system interface data cycle.

this section explains the syntax of the commands and data identifiers of the system interface, i.e., coding in bit units.

set the reserved bits and reserved area in the commands and data identifiers of the system interface related to external requests to 1. the reserved bits and reserved area in the commands and data identifiers of the system interface related to processor requests are undefined.

13.6.1 syntax of commands and data identifiers

the commands and data identifiers of the system interface are coded in 9-bit units, and transferred from the processor to the external agent, or vice versa, via the syscmd bus in the address cycle and data cycle. syscmd8 (most significant bit) determines whether the current contents of the syscmd bus are a command (address cycle) or data identifier (data cycle). if they are a command, clear syscmd8 to 0; if they are a data identifier, set it to 1.

13.6.2 syntax of command

this section explains the coding of the syscmd bus when a system interface command is used. figure 13-15 shows the common code used for all the system interface commands.

figure 13-15. bit definition of system interface command

bit 8      bits 7 to 5      bits 4 to 0
0          request type     details of request

be sure to clear syscmd8 to 0 when a system interface command is used.

syscmd(7:5) define the types of system interface requests such as read, write, and null.

table 13-2. code of system interface command syscmd(7:5)

bit            contents
syscmd(7:5)    command
               0: read request
               1: reserved
               2: write request
               3: null request
               4 to 7: reserved

syscmd(4:0) are determined according to the type of request.
a definition of each request is given below.
(1) read request

the code of the syscmd bus related to a read request is shown below. figure 13-16 shows the format of the command when a read request is issued. tables 13-3 to 13-5 show the code of the read attribute of the syscmd(4:0) bits related to the read request.

figure 13-16. bit definition of syscmd bus during read request

bit 8      bits 7 to 5      bits 4 to 0
0          000              details of read request (refer to the tables below)

table 13-3. code of syscmd(4:3) during read request

bit            contents
syscmd(4:3)    read attribute
               0, 1: reserved
               2: block read
               3: single read

table 13-4. code of syscmd(2:0) during block read request

bit            contents
syscmd2        reserved
syscmd(1:0)    size of read block
               0: reserved
               1: 8 words
               2, 3: reserved

table 13-5. code of syscmd(2:0) during single read request

bit            contents
syscmd(2:0)    read data size
               0: 1 byte is valid (byte).
               1: 2 bytes are valid (halfword).
               2: 3 bytes are valid.
               3: 4 bytes are valid (word).
               4: 5 bytes are valid.
               5: 6 bytes are valid.
               6: 7 bytes are valid.
               7: 8 bytes are valid (doubleword).
(2) write request

the code of the syscmd bus related to a write request is shown below. figure 13-17 shows the format of the command when a write request is issued. tables 13-6 to 13-8 show the code of the write attribute of the syscmd(4:0) bits related to the write request.

figure 13-17. bit definition of syscmd bus during write request

bit 8      bits 7 to 5      bits 4 to 0
0          010              details of write request (refer to the tables below)

table 13-6. code of syscmd(4:3) during write request

bit            contents
syscmd(4:3)    write attribute
               0, 1: reserved
               2: block write
               3: single write

table 13-7. code of syscmd(2:0) during block write request

bit            contents
syscmd2        update of cache line
               0: replaced
               1: retained
syscmd(1:0)    size of write block
               0: reserved
               1: 8 words
               2, 3: reserved

table 13-8. code of syscmd(2:0) during single write request

bit            contents
syscmd(2:0)    write data size
               0: 1 byte is valid (byte).
               1: 2 bytes are valid (halfword).
               2: 3 bytes are valid.
               3: 4 bytes are valid (word).
               4: 5 bytes are valid.
               5: 6 bytes are valid.
               6: 7 bytes are valid.
               7: 8 bytes are valid (doubleword).
(3) null request

figure 13-18 shows the format of the command when a null request is used.

figure 13-18. bit definition of syscmd bus during null request

bit 8      bits 7 to 5      bits 4 to 0
0          011              details of null request (refer to the table below)

table 13-9 shows the code of the syscmd(4:3) bits related to the null request. for the null request, the syscmd(2:0) bits are reserved.

table 13-9. code of syscmd(4:3) during null request

bit            contents
syscmd(4:3)    null attribute
               0: released
               1 to 3: reserved

13.6.3 syntax of data identifier

this section explains coding of the syscmd bus when a system interface data identifier is used. figure 13-19 shows the common code used for all system interface data identifiers.

figure 13-19. bit definition of system interface data identifier

bit 8      bit 7                      bit 6                          bit 5                       bit 4                bits 3 to 0
1          indication of last data    indication of response data    indication of error data    data check enable    reserved

be sure to set syscmd8 of the system interface data identifier to 1.
a definition of the syscmd(7:0) bits is given below.

syscmd7: indicates whether the data element is the last one.
syscmd6: indicates whether the data is response data. response data is returned in response to a read request.
syscmd5: indicates whether the data element contains an error. the error indicated in the data cannot be corrected. if this data is returned to the processor, a bus error exception occurs. in the case of a response block, send the entire line to the processor regardless of the degree of error. the processor checks syscmd5 of the first doubleword of the block response data. the external agent should ignore this bit in a processor data identifier because no error is indicated.
syscmd4: this bit in an external data identifier indicates whether the data of the data element and check bit are checked. this bit in a processor data identifier is reserved.
syscmd(3:0): these bits are reserved.

table 13-10 shows the codes of syscmd(7:5) of a processor data identifier, and table 13-11 shows the codes of syscmd(7:4) of an external data identifier. the syscmd5 coding below follows 13.2.5, where syscmd5 = 1 indicates that the data has an error.

table 13-10. codes of syscmd(7:5) of processor data identifier

bit        contents
syscmd7    indication of last data element
           0: last data element
           1: not last data element
syscmd6    indication of response data
           0: response data
           1: not response data
syscmd5    indication of error data
           0: no error occurred
           1: error occurred

table 13-11. codes of syscmd(7:4) of external data identifier

bit        contents
syscmd7    indication of last data element
           0: last data element
           1: not last data element
syscmd6    indication of response data
           0: response data
           1: not response data
syscmd5    indication of error data
           0: no error occurred
           1: error occurred
syscmd4    data check enable
           0: data and check bit checked
           1: data and check bit not checked
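the bit coding of 13.6 can be summarized as a small decoder. the following is a sketch for illustration only: decode_syscmd and its field names are hypothetical, and the error flag follows the statement in 13.2.5 that syscmd5 = 1 marks data with an error (an assumption wherever the tables are read differently).

```python
REQUEST_TYPES = {0: "read", 2: "write", 3: "null"}  # all other codes are reserved


def bit(value: int, n: int) -> int:
    """return bit n of value."""
    return (value >> n) & 1


def decode_syscmd(value: int, external: bool = False) -> dict:
    """decode one 9-bit syscmd value into named fields (hypothetical helper)."""
    if not 0 <= value <= 0x1FF:
        raise ValueError("syscmd is a 9-bit bus")
    if bit(value, 8):  # syscmd8 = 1: data identifier (data cycle)
        fields = {
            "kind": "data identifier",
            "last": bit(value, 7) == 0,
            "response": bit(value, 6) == 0,
            "error": bit(value, 5) == 1,  # assumption: coding of 13.2.5
        }
        if external:  # syscmd4 is defined only for external identifiers
            fields["checked"] = bit(value, 4) == 0
        return fields
    # syscmd8 = 0: command (address cycle)
    request = REQUEST_TYPES.get((value >> 5) & 0x7, "reserved")
    attr = (value >> 3) & 0x3
    low = value & 0x7
    fields = {"kind": "command", "request": request}
    if request in ("read", "write"):
        if attr == 2:
            fields["attribute"] = "block"
            fields["block_words"] = 8 if (low & 0x3) == 1 else None
            if request == "write":
                fields["cache_line"] = "retained" if low & 0x4 else "replaced"
        elif attr == 3:
            fields["attribute"] = "single"
            fields["valid_bytes"] = low + 1  # 0 means 1 byte, 7 means 8 bytes
        else:
            fields["attribute"] = "reserved"
    elif request == "null":
        fields["attribute"] = "released" if attr == 0 else "reserved"
    return fields
```

for example, the value 0x01b decodes as a single read request with 4 valid bytes (a word), and 0x051 decodes as a block write of 8 words with the cache line replaced.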
13.7 system interface address

the system interface address is a 36-bit physical address and is output to sysad(35:0) in the address cycle. the other bits of the sysad bus are not used in the address cycle.

13.7.1 address specification rules

an address related to transferring data such as a word and an unaligned word is aligned in accordance with the size of the data element. the system uses the following address rules.

- an address related to the request of a block is aligned at the requested doubleword boundary. therefore, the lower 3 bits of the address are 0.
- the lower 3 bits of an address for a doubleword request are cleared to 0.
- the lower 2 bits of an address for a word request are cleared to 0.
- the least significant bit of an address for a halfword request is cleared to 0.
- each request of 1, 3, 5, 6, and 7 bytes uses a byte address.

13.7.2 sub-block ordering

the data returned in response to a processor block read request is ordered by sub-block ordering. with sub-block ordering, the processor outputs the address of the doubleword required in a block. the external agent must return a block that starts with the specified doubleword, by using sub-block ordering (for details, refer to appendix a sub-block order).

for a block write request, the processor always outputs the address of the first doubleword in the block. it sequentially outputs the doublewords in the block, starting from the first doubleword of the block.

in the data cycle, whether the byte lane of an aligned doubleword (or byte, halfword, 3 bytes, word, 6 bytes, or 7 bytes) is valid or not depends on the position of the data. in the little-endian mode, for example, sysad(7:0) of a byte request where the lower 3 address bits are 0 are valid in the data cycle.
for the byte lane that is used when an unaligned word in big endian and little endian is transferred, refer to figure 3-3 byte specification related to load/store instruction . 13.7.3 processor internal address map for an external write, the external agent accesses the internal resources of the processor. when an external write request is made, the processor decodes the sysad(6:4) bits of the address that is output, to determine which of the resources of the processor is to be accessed. the only internal resource of the processor that can be accessed by an external write request is the interrupt register. access the interrupt register by an external write access, by specifying an address that clears sysad(6:4) to 000.
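the address rules of 13.7.1 amount to masking the low-order address bits by request size. the following sketch illustrates this; request_address is a hypothetical helper name, and the function only models the rules listed in 13.7.1, not the processor's actual address generation.

```python
ADDRESS_MASK = (1 << 36) - 1  # the system interface address is 36 bits wide


def request_address(addr: int, nbytes: int, block: bool = False) -> int:
    """return the address placed on sysad(35:0) under the rules of 13.7.1."""
    if addr < 0:
        raise ValueError("address must be non-negative")
    addr &= ADDRESS_MASK
    if block or nbytes == 8:
        return addr & ~0x7  # block and doubleword: lower 3 bits are 0
    if nbytes == 4:
        return addr & ~0x3  # word: lower 2 bits are 0
    if nbytes == 2:
        return addr & ~0x1  # halfword: least significant bit is 0
    if nbytes in (1, 3, 5, 6, 7):
        return addr  # these sizes use a byte address
    raise ValueError("request size must be 1 to 8 bytes")
```

for example, a word request for address 0x1234567 is issued with address 0x1234564, while a 3-byte request keeps the byte address 0x1234567.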
chapter 14 system interface (32-bit bus mode)

this chapter explains the request protocol of the system interface in the 32-bit bus normal mode.

the system interface of the vr5500 can be set in the 32-bit bus mode by inputting a low level to the busmode pin before a power-on reset. it can also be set in the normal mode by inputting a high level to the o3return# pin before a power-on reset, and in the out-of-order return mode by inputting a low level to the same pin.

the 32-bit bus normal mode includes two protocol modes: r5000 mode and vr5432 native mode. these modes can be selected according to the combination of levels input to the dwbtrans# and disdvalido# pins before a power-on reset.

- r5000 mode
  the r5000 mode is selected when a high level is input to both the dwbtrans# and disdvalido# pins. this mode is compatible with the bus protocol of the rm523x (a product of pmc-sierra).
- vr5432 native mode
  the vr5432 native mode is selected when a low level is input to both the dwbtrans# and disdvalido# pins. this mode is compatible with the bus protocol of the native mode of the vr5432.

[mode selection diagram]
vr5500 bus mode
  busmode = h: 64-bit bus mode
    o3return# = l: out-of-order return mode
    o3return# = h, dwbtrans# = h, disdvalido# = h: r5000 mode
  busmode = l: 32-bit bus mode
    o3return# = l: out-of-order return mode
    o3return# = h, dwbtrans# = h, disdvalido# = h: r5000 mode (compatible with rm523x)
    o3return# = h, dwbtrans# = l, disdvalido# = l: vr5432 native mode

for the protocol in the 64-bit bus normal modes (operation mode compatible with the vr5000), refer to chapter 13 system interface (64-bit bus mode). for the protocol in the out-of-order return mode, refer to chapter 15 system interface (out-of-order return mode).
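the pin combinations described above can be written as a lookup. this is a sketch under stated assumptions: bus_mode is a hypothetical name, pin levels are passed as booleans (true = high level before the power-on reset), and any combination the text does not describe is rejected rather than guessed.

```python
def bus_mode(busmode: bool, o3return_n: bool,
             dwbtrans_n: bool, disdvalido_n: bool) -> str:
    """map pin levels sampled before a power-on reset to a mode name.

    true means a high level. combinations not described in the text
    raise ValueError instead of being guessed.
    """
    width = "64-bit bus" if busmode else "32-bit bus"
    if not o3return_n:
        # a low level on o3return# selects the out-of-order return mode
        return f"{width} out-of-order return mode"
    if dwbtrans_n and disdvalido_n:
        return f"{width} r5000 mode"
    if not busmode and not dwbtrans_n and not disdvalido_n:
        return "32-bit bus vr5432 native mode"
    raise ValueError("pin combination not described in the manual text")
```

for example, busmode low with the other three pins high selects the 32-bit bus r5000 mode.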
14.1 protocol of processor requests

this section explains the following two processor request protocols.

- read
- write

14.1.1 processor read request protocol

the following sequence explains the protocol of a processor read request for a doubleword, unaligned doubleword, word, and unaligned word (the numbers correspond to the numbers in figure 14-1).

<1> the external agent drives the rdrdy# signal low to indicate that it is ready to acknowledge a read request.
<2> when the system interface is in the master status, the processor issues a processor read request by driving a read command onto the syscmd bus and a read address (physical address) onto the sysad bus.
<3> at the same time, the processor asserts the validout# signal for the duration of 1 cycle. this signal indicates that valid data is on the syscmd and sysad buses.
<4> the processor puts the system interface in the uncompelled slave status. the external agent must wait, without asserting the extrqst# signal in an attempt to return a read response, until transition of the system interface to the uncompelled slave status is completed.
<5> the processor releases the syscmd and sysad buses 1 cycle after the release# signal has been asserted.
<6> the external agent drives the syscmd and sysad buses 2 cycles after the release# signal has been asserted.

when the system interface has been put in the slave status, the external agent can return the requested data by using a read response. if the requested data could not be retrieved correctly, the read response can return, together with the data, an indication that an error has occurred in the data. if the returned data contains an error, the processor generates a bus error exception.

figure 14-1 shows the processor read request, and the uncompelled transition to the slave status that takes place when the read request is issued.
the timing of the sysadc bus is the same as that of the sysad bus.
figure 14-1. processor read request
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), rdrdy# (input), validout# (output), release# (output). the sysad bus carries addr and the syscmd bus carries read; the system interface goes from master to slave. callouts <1> to <6> mark the steps of 14.1.1.]
remark the dotted line indicates high impedance.

after the release# signal has been asserted (<6> and later in the figure), the processor can acknowledge both a read response (if the read request is pending) and an external request.

14.1.2 processor write request protocol

the processor write request is issued by using either of the following two protocols.

- a write request for a word or unaligned word uses a single write request protocol.
- cache block write and uncached accelerated write use a block write request protocol.

a processor write request is issued when the system interface is in the master status. figure 14-2 shows the processor single write request cycle and figure 14-3 shows the processor block write request cycle (the numbers in the explanation below correspond to the numbers in the figures).

<1> the external agent drives the wrrdy# signal low to indicate that it is ready to acknowledge a write request.
<2> the processor issues a processor write request by driving a write command onto the syscmd bus and a write address onto the sysad bus.
<3> the processor asserts the validout# signal.
<4> the processor drives a data identifier onto the syscmd bus and data onto the sysad bus.
<5> the data identifier corresponding to the data cycle must include an indication of the last data cycle. at the end of the cycle, the validout# signal is deasserted.

remark the timing of the sysadc bus is the same as that of the sysad bus.
figure 14-2. processor non-block write request protocol
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input). the sysad bus carries addr and then data0; the syscmd bus carries write and then neod; the system interface stays in the master status. callouts <1> to <5> mark the steps of 14.1.2.]

figure 14-3. processor block write request
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input). the sysad bus carries addr and then data0 to data7; the syscmd bus carries write, then ndata for the first seven data cycles and neod for the last; the system interface stays in the master status. callouts <1> to <5> mark the steps of 14.1.2.]
14.1.3 control of processor request flow

the external agent uses the rdrdy# signal to control the flow of processor read requests. figure 14-4 shows the control of the read request flow (the numbers in the explanation below correspond to the numbers in the figure).

<1> the processor samples the rdrdy# signal and determines whether the external agent can acknowledge a read request.
<2> the processor issues a read request to the external agent.
<3> the external agent deasserts the rdrdy# signal. this indicates that no more read requests can be acknowledged.
<4> because the rdrdy# signal was deasserted two cycles earlier, issuance of the read request is stalled.
<5> the read request is issued again to the external agent.

figure 14-4. control of processor request flow (1/2)
(a) r5000 mode
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), rdrdy# (input), release# (output). two read requests (addr/read) are each followed by a response (data/neod), with unused cycles (unsd) in between; the system interface alternates between master and slave. callouts <1> to <5> mark the steps above.]
remark the dotted line indicates high impedance.
figure 14-4. control of processor request flow (2/2)
(b) vr5432 native mode
[timing diagram over syscycles 1 to 12. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), rdrdy# (input), release# (output). two read requests (addr/read) are each followed by a response (data/neod), with unused cycles (unsd) in between; the system interface alternates between master and slave. callouts <1> to <5> mark the steps of 14.1.3.]
remark the dotted line indicates high impedance.

figure 14-5 shows an example in which two processor write requests are issued but issuance of the second request is delayed because of the condition of the wrrdy# signal (the numbers in the explanation below correspond to the numbers in the figure).

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write request.
<2> the processor asserts the validout# signal, and drives a write command onto the syscmd bus and a write address onto the sysad bus.
<3> the second write request is delayed until the wrrdy# signal is asserted again.
<4> if the wrrdy# signal was active two cycles earlier, an address cycle is issued in response to the processor write request. this completes the issuance of the write request.

remark the timing of the sysadc bus is the same as that of the sysad bus.
figure 14-5. timing when second processor write request is delayed
(a) r5000 mode
(b) vr5432 native mode
[two timing diagrams over syscycles 1 to 12, one per mode. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input). each shows two write requests (addr/write followed by data/neod) in the master status. callouts <1> to <4> mark the steps above.]

14.1.4 timing mode of processor request

the vr5500 has three timing modes: vr4000-compatible mode, write re-issuance mode, and pipeline write mode.

- vr4000-compatible mode
  if single write requests are successively issued, the processor inserts two unused cycles after the data cycle so that an address cycle is issued once every 4 system cycles.
- write re-issuance mode
  if the wrrdy# signal is deasserted in the address cycle of a write request, that request is discarded, but the processor issues the same write request again.
- pipeline write mode
  even if the wrrdy# signal is deasserted in the address cycle of a write request, the processor assumes that it has issued that request.
(1) vr4000-compatible mode

with the vr5500 processor interface, the wrrdy# signal must be asserted two system clocks before issuance of a write cycle. if the wrrdy# signal is deasserted immediately after the external agent has received a write request that fills the buffer, the subsequent write requests are kept waiting for the duration of 4 system cycles in the vr4000 non-block-write-compatible mode. the processor inserts at least two unused system cycles after a write address/data pair, giving the external agent time to hold off the next write request.

figure 14-6 shows a back-to-back write cycle in the vr4000-compatible mode (the numbers in the explanation below correspond to the numbers in the figure).

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write cycle.
<2> the wrrdy# signal remains active. this indicates that the external agent can acknowledge another write request.
<3> the wrrdy# signal is deasserted. this indicates that the external agent cannot acknowledge any more write requests, and that issuance of the next write request is stalled.

figure 14-6. timing of vr4000-compatible back-to-back write cycle
(a) r5000 mode
(b) vr5432 native mode
[two timing diagrams over syscycles 1 to 14, one per mode. signals: sysclock (input), sysad(31:0) (i/o), validout# (output), wrrdy# (input). each shows three writes (write#1 to write#3) as addr/data pairs separated by unused cycles (unsd), with address cycles spaced 4 cycles apart, in the master status. callouts <1> to <3> mark the steps above.]
(2) write re-issuance mode

figure 14-7 shows the write re-issuance protocol (the numbers in the explanation below correspond to the numbers in the figure). a write request is issued when the wrrdy# signal is asserted both two cycles before the address cycle and in the address cycle.

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write request.
<2> the wrrdy# signal remains active even when the write request has been issued. this indicates that the external agent can acknowledge another write request.
<3> the wrrdy# signal is deasserted in the address cycle. this write cycle is aborted.
<4> the external agent asserts the wrrdy# signal, indicating that it is ready to acknowledge a write request. in response, the write request aborted in <3> is re-issued.
<5> even if a write request is issued, the wrrdy# signal remains active. this indicates that the external agent can acknowledge another write request.

figure 14-7. write re-issuance
(a) r5000 mode
(b) vr5432 native mode
[two timing diagrams over syscycles 1 to 14, one per mode. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input). the sysad bus carries addr0/data0, then addr1/data1 (not issued), unused cycles (unsd), and addr1/data1 again (re-issued); the syscmd bus carries write and neod for each pair. callouts <1> to <5> mark the steps above.]
(3) pipeline write mode

figure 14-8 shows the pipeline write protocol (the numbers in the explanation below correspond to the numbers in the figure). if the wrrdy# signal is asserted two cycles before the address cycle, a write request is issued. after the wrrdy# signal has been deasserted, the external agent must acknowledge one more write request.

<1> the external agent asserts the wrrdy# signal to indicate that it is ready to acknowledge a write request.
<2> even when the write request has been issued, the wrrdy# signal remains active. this indicates that the external agent can acknowledge one more write request.
<3> the wrrdy# signal is deasserted. this indicates that the external agent can acknowledge no more write requests. however, this write request is acknowledged.
<4> the external agent asserts the wrrdy# signal, indicating that it can acknowledge a write request.

figure 14-8. pipeline write
(a) r5000 mode
(b) vr5432 native mode
[two timing diagrams over syscycles 1 to 14, one per mode. signals: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), wrrdy# (input). the sysad bus carries addr0/data0, addr1/data1, unused cycles (unsd), and addr2/data2, all three writes being issued; the syscmd bus carries write and neod for each pair. callouts <1> to <4> mark the steps above.]
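the difference between the three timing modes comes down to when the wrrdy# signal is sampled. the sketch below restates the rules of 14.1.4 as a decision function; write_outcome, the mode names, and the trace representation (a list with one boolean per system cycle, true = wrrdy# asserted) are assumptions made for illustration, and the 4-cycle spacing of the vr4000-compatible mode is not modeled.

```python
MODES = ("vr4000", "reissue", "pipeline")  # hypothetical mode names


def write_outcome(mode: str, wrrdy: list, addr_cycle: int) -> str:
    """classify a write whose address cycle would fall on addr_cycle.

    wrrdy[i] is true when wrrdy# is asserted in system cycle i.
    returns "stalled", "aborted" or "issued".
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 2 <= addr_cycle < len(wrrdy):
        raise ValueError("addr_cycle must lie inside the trace, at index 2 or later")
    # all modes require wrrdy# to be asserted two cycles before the address cycle
    if not wrrdy[addr_cycle - 2]:
        return "stalled"
    # write re-issuance mode also samples wrrdy# in the address cycle itself;
    # a discarded request is issued again later
    if mode == "reissue" and not wrrdy[addr_cycle]:
        return "aborted"
    # pipeline write mode treats the request as issued even if wrrdy# has
    # been deasserted by the address cycle
    return "issued"
```

with the trace [true, true, true, false, false] and an address cycle at index 3, the write is aborted in the write re-issuance mode but counts as issued in the pipeline write mode.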
14.2 protocol of external request

an external request can be issued only when the system interface is in the slave status. arbitration that changes the status of the system interface from master to slave is realized by using the handshake signals of the system interface (extrqst# and release#). this section explains the following external request protocols, as well as the arbitration protocol.

- null
- write
- read response

14.2.1 external arbitration protocol

to issue an external request, assert the extrqst# signal to arbitrate for the system interface. then wait until the processor asserts the release# signal and releases the system interface to the slave status. when the system interface is already in the slave status, i.e., when the processor previously executed an uncompelled transition of the system interface to the slave status, the external agent can immediately start issuing an external request.

after issuing an external request, the external agent must return the right to control the system interface to the processor. if the external agent does not have any more external requests that must be processed, it must deassert the extrqst# signal two cycles after the release# signal was asserted. to issue two or more requests in a row, the extrqst# signal must be kept active until the last request cycle. if the last request cycle lasts for two cycles or more after the release# signal was asserted, deassert the extrqst# signal.

while the extrqst# signal is active, the processor continues processing the external request. however, the processor cannot release the system interface to process the next external request until processing of the current request is finished. while the extrqst# signal is active, two or more successive external requests cannot be interrupted by a processor request.
figure 14-9 shows the arbitration protocol of an external request issued by the external agent. the following sequence explains the arbitration protocol (the numbers in the explanation below correspond to the numbers in the figure).
<1> the external agent continues asserting the extrqst# signal to issue an external request.
<2> the processor asserts the release# signal for 1 cycle when it is ready to process the external request.
<3> the processor makes the sysad and syscmd buses go into a high-impedance state.
<4> the external agent must start driving the sysad and syscmd buses no earlier than two cycles after the release# signal was asserted.
<5> the external agent must deassert the extrqst# signal two cycles after the release# signal was asserted, except when it executes another external request.
<6> the external agent must make the sysad and syscmd buses go into a high-impedance state on completion of the external request.
remarks 1. the processor can issue a request one cycle after the external agent has set the system interface to a high-impedance state.
2. the timing of the sysadc bus is the same as that of the sysad bus.
figure 14-9. external request arbitration protocol
[timing chart, syscycle 1 to 12: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validin# (input), extrqst# (input), release# (output). the interface goes master, slave, master; the external agent drives addr/data0 with cmd/neod while in the slave status; <1> to <6> mark the steps above.]
remark the dotted line indicates high impedance.
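the two timing constraints on the external agent (steps <4> and <5>) can be written down as a short helper. this is a sketch under stated assumptions: the function name, the dictionary keys, and the idea of counting cycles from the cycle in which release# is asserted are all hypothetical and only restate the rules above.

```python
def arbitration_schedule(release_cycle, more_requests=False):
    """Restate steps <4> and <5> relative to the cycle release# is asserted.

    release_cycle: syscycle number in which the processor asserts release#.
    more_requests: True if the agent will issue another external request,
                   in which case extrqst# stays asserted.
    """
    earliest_drive = release_cycle + 2  # step <4>: not before release + 2
    # step <5>: deassert at release + 2 unless another request follows
    deassert_at = None if more_requests else release_cycle + 2
    return {
        "drive_buses_no_earlier_than": earliest_drive,
        "deassert_extrqst_at": deassert_at,
    }


print(arbitration_schedule(3))
print(arbitration_schedule(3, more_requests=True))
```

a value of none for the deassert cycle simply means that extrqst# is kept active for the following request.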
14.2.2 external null request protocol
the processor supports an external null request. this request only returns the system interface from the slave status to the master status, and does not have any other influence on the processor.
figure 14-10 shows the timing of the external null request (the numbers in the explanation below correspond to the numbers in the figure).
<1> the external agent drives an external null request command onto the syscmd bus and asserts the validin# signal for one cycle. this returns the right to control the system interface to the processor.
<2> the sysad bus is not used in the address cycle corresponding to the external null request (the bus does not hold valid data).
<3> when the address cycle is issued, the null request is completed.
the external null request returns the system interface to the master status when the external agent has released the syscmd and sysad buses.
figure 14-10. external null request protocol
[timing chart, syscycle 1 to 12: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), extrqst# (input), release# (output). the interface goes from slave to master; sysad carries unsd and syscmd carries sinull in the address cycle; validout#, extrqst#, and release# stay high; <1> to <3> mark the steps above.]
remark the dotted line indicates high impedance.
14.2.3 external write request protocol
the external write request performs an operation close to the processor single write request, except that it asserts the validin# signal, instead of the validout# signal.
figure 14-11 shows the timing of the external write request (the numbers in the explanation below correspond to the numbers in the figure).
<1> the external agent asserts the extrqst# signal to arbitrate the system interface.
<2> the processor asserts the release# signal to release the system interface to the slave status.
<3> the external agent asserts the validin# signal and drives a write command onto the syscmd bus and a write address onto the sysad bus.
<4> the external agent asserts the validin# signal and drives a data identifier onto the syscmd bus and data onto the sysad bus.
<5> the data identifier corresponding to the data cycle must contain an indication of the last data cycle.
<6> when the data cycle is issued, the write request is completed. the external agent makes the syscmd and sysad buses go into a high-impedance state, and returns the system interface to the master status.
the external write request can only write word data to the processor. if a data element other than a word is specified for the external write request, the operation of the processor is undefined.
figure 14-11. external write request protocol
[timing chart, syscycle 1 to 12: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validin# (input), validout# (output), extrqst# (input), release# (output). the interface goes master, slave, master; the external agent drives addr then data0 on sysad and write then neod on syscmd; validout# stays high; <1> to <6> mark the steps above.]
remark the dotted line indicates high impedance.
14.2.4 read response protocol
the external agent must return data to the processor by using a read response protocol, in response to a processor read request. the following sequence explains the read response protocol (the numbers in the explanation below correspond to the numbers in figures 14-12 and 14-13).
<1> the external agent waits until the processor puts the system interface in the uncompelled slave status.
<2> the external agent returns data via a single data cycle or a series of data cycles.
<3> when the last data cycle is issued, the read response is completed, and the external agent makes the syscmd and sysad buses go into a high-impedance state.
<4> the system interface returns to the master status.
remark when the read request is issued, the processor always puts the system interface in the uncompelled slave status.
<5> the data identifier of the data cycle must indicate that this data is response data.
<6> the data identifier corresponding to the last data cycle must contain an indication of the last data cycle.
if the read response is for a block read request, the response data does not have to identify the initial cache status. the processor automatically allocates the cache to the clean status.
the data identifier corresponding to the data cycle can indicate that the data transferred in that cycle has an error. even if the data has an error, however, the external agent must return a data block of the correct size.
read response data is passed to the processor only when there is a pending processor read request. the operation of the processor is undefined if there is no pending processor read request when a read response is received.
figure 14-12 shows a processor word request and the word read response that follows. figure 14-13 shows the read response to a processor block read request when the system interface is already in the slave status.
remark the timing of the sysadc bus is the same as that of the sysad bus.
figure 14-12. protocol of read request and read response
[timing chart, syscycle 1 to 12: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validin# (input), validout# (output), extrqst# (input), release# (output). the processor issues addr/read as master, the interface enters the slave status, the external agent returns data0 with neod, and the interface returns to master; extrqst# stays high; <1>, <2>, <3>, <4>, and <6> mark the steps of 14.2.4.]
remark the dotted line indicates high impedance.
figure 14-13. block read response in slave status
[timing chart, syscycle 1 to 12: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validin# (input), validout# (output), extrqst# (input), release# (output). the interface starts in the slave status and then returns to master; the external agent returns data0 to data7, the first seven with ndata and the last with neod; validout#, release#, and extrqst# stay high; <2> to <6> mark the steps of 14.2.4, with <5> applying to every data cycle.]
remark the dotted line indicates high impedance.
14.2.5 sysadc(3:0) protocol for block read response
when a block read response is issued, sysadc(3:0) must be used in compliance with the following rules.
• only the first two words of transfer data are checked. if the data has an error (syscmd5 = 1), the cache line is invalidated, and a bus error exception occurs in the processor.
• a parity error of the first two words is detected when a request is issued, and a cache error exception occurs. at this time, the cache line is in the invalid status. a parity error of a subsequent word is detected again when that data is used.
• the error bits in six subsequent words of data are ignored. the parity of each word is written to the cache, but is not checked until the data is referenced.
• if a memory error occurs during a block read operation, the sysadc bit must be changed to an illegal parity during a read response operation for all the bytes that are affected by the memory error. however, even if syscmd5 is set to 1 during data transfer other than the first two words, a bus error exception does not occur. if the sysadc bit has been changed to an illegal parity, a cache error exception occurs when any of the remaining six words is referenced.
14.3 data flow control
the system interface supports a data rate of 1 word per cycle.
14.3.1 data rate control
the external agent can send data to the processor at the maximum data rate of the system interface. the rate at which data is to be sent to the processor can be controlled on the external agent side. the transfer rate from the external agent is not limited.
the external agent asserts the validin# signal in the cycle in which it transfers data. when the validin# signal has been asserted and as long as a data identifier is on the syscmd bus, the processor acknowledges the cycle as valid. it then goes on acknowledging data until it receives a data word with neod.
the operation of the processor is undefined unless single data is sent in 1 data cycle and block data is sent in 2 or 8 data cycles. figure 14-14 shows the timing of the read response where the data rate pattern is ddx.
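the length rule above can be checked mechanically for a candidate response pattern. the sketch below assumes the d/x notation used for data patterns (d is a data cycle, x is an unused cycle); the function names and the table constant are hypothetical.

```python
# number of data cycles for which the processor's behavior is defined
VALID_DATA_CYCLES = {"single": (1,), "block": (2, 8)}


def data_cycles(pattern):
    """Count the data cycles (d) in a d/x pattern string."""
    return pattern.count("d")


def response_is_defined(kind, pattern):
    """True if the pattern carries a permitted number of data cycles."""
    return data_cycles(pattern) in VALID_DATA_CYCLES[kind]


print(response_is_defined("block", "ddxddxddxddx"))  # 8 data cycles
print(response_is_defined("block", "dddx"))          # 3 data cycles
```

unused cycles between data cycles do not affect the result, which matches the statement that the external agent controls the transfer rate.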
figure 14-14. read response with data rate pattern ddx
[timing chart, syscycle 1 to 13: sysclock (input), sysad(31:0) (i/o), syscmd(8:0) (i/o), validout# (output), validin# (input), extrqst# (input), release# (output). the interface is in the slave status; the external agent returns data0 to data7 in pairs separated by one unused cycle, the first seven with ndata and data7 with neod; validout#, extrqst#, and release# stay high.]
remark the dotted line indicates high impedance.
14.3.2 block write data transfer pattern
the rate at which the processor transfers block write data to the external agent can be set by the ep bit of the config register after reset. the data pattern is indicated by characters d and x that indicate the array of data cycle and unused cycle at each data rate. d indicates a data cycle, and x indicates an unused cycle. for example, dxx data pattern indicates a data rate of 1 word in every 3 cycles.
table 14-1 shows the maximum data rate that can be set after reset.
table 14-1. transfer data rate and data pattern
maximum data rate     data pattern
1 word/1 cycle        dddddddd
2 words/3 cycles      ddxddxddxddx
2 words/4 cycles      ddxxddxxddxxddxx
1 word/2 cycles       dxdxdxdxdxdxdxdx
2 words/5 cycles      ddxxxddxxxddxxxddxxx
2 words/6 cycles      ddxxxxddxxxxddxxxxddxxxx
1 word/3 cycles       dxxdxxdxxdxxdxxdxxdxxdxx
2 words/8 cycles      ddxxxxxxddxxxxxxddxxxxxxddxxxxxx
1 word/4 cycles       dxxxdxxxdxxxdxxxdxxxdxxxdxxxdxxx
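every pattern in table 14-1 is a repeated group of data cycles followed by unused cycles, covering 8 words in total. a minimal sketch that regenerates the patterns from the rate is shown below; the function name and parameters are hypothetical and the mapping from the ep bit encoding to a rate is not modeled.

```python
def data_pattern(words, cycles, total_words=8):
    """Build the d/x pattern for `words` data cycles in every `cycles` cycles."""
    if not 0 < words <= cycles:
        raise ValueError("need 0 < words <= cycles")
    if total_words % words:
        raise ValueError("total_words must be a multiple of words")
    group = "d" * words + "x" * (cycles - words)  # one repeating group
    return group * (total_words // words)


for words, cycles in [(1, 1), (2, 3), (2, 4), (1, 2)]:
    print(f"{words} word(s)/{cycles} cycle(s): {data_pattern(words, cycles)}")
```

words and cycles are kept as separate arguments because two table rows can have the same average rate but different groupings, for example 2 words/4 cycles (ddxx) and 1 word/2 cycles (dx).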
14.3.3 word transfer sequence
the vr5500 transfers a 32-bit address in one address cycle and 32-bit data in one data cycle. it takes two system cycles to transfer each doubleword as a block. data is transferred in these two cycles in the following sequence.
• the lower 4 bytes (lower word) are transferred in the first data cycle in the little-endian mode, and in the second data cycle in the big-endian mode.
• the higher 4 bytes (higher word) are transferred in the second data cycle in the little-endian mode, and in the first data cycle in the big-endian mode.
the vr5500 can transfer a word or an unaligned word in one system cycle.
the table below shows the transfer sequence in both the little-endian and big-endian modes to write a block, doubleword, unaligned doubleword, word, and unaligned word.
table 14-2. data write sequence
block
  little endian: 1. a(31:0) 2. d0(31:0) 3. d0(63:32) 4. d1(31:0) 5. d1(63:32) 6. d2(31:0) 7. d2(63:32) 8. d3(31:0) 9. d3(63:32)
  big endian:    1. a(31:0) 2. d0(63:32) 3. d0(31:0) 4. d1(63:32) 5. d1(31:0) 6. d2(63:32) 7. d2(31:0) 8. d3(63:32) 9. d3(31:0)
doubleword (in r5000 mode)
  little endian: 1. a(31:0) 2. d(31:0) 3. a(31:0) 4. d(63:32)
  big endian:    1. a(31:0) 2. d(63:32) 3. a(31:0) 4. d(31:0)
doubleword (in vr5432 native mode)
  little endian: 1. a(31:0) 2. d(31:0) 3. d(63:32)
  big endian:    1. a(31:0) 2. d(63:32) 3. d(31:0)
word or unaligned word
  little endian: 1. a(31:0) 2. w(31:0)
  big endian:    1. a(31:0) 2. w(31:0)
remark a: address, d: doubleword, w: word
dn: n+1th doubleword in block data (n = 0 to 3)
dn(31:0): lower word of doubleword data dn(63:0)
dn(63:32): higher word of doubleword data dn(63:0)
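the block row of table 14-2 follows directly from the two word-order rules above: one address cycle, then each doubleword as two words whose order depends on the endianness. a short sketch that reproduces that row, with a hypothetical function name and using the table's own labels as strings:

```python
def block_write_sequence(big_endian, doublewords=4):
    """List the bus cycles of a block write in the notation of table 14-2."""
    sequence = ["a(31:0)"]  # one address cycle comes first
    for n in range(doublewords):
        low, high = f"d{n}(31:0)", f"d{n}(63:32)"
        # little endian sends the lower word first, big endian the higher word
        sequence += [high, low] if big_endian else [low, high]
    return sequence


print(block_write_sequence(big_endian=False))
print(block_write_sequence(big_endian=True))
```

the returned list has nine entries for a four-doubleword block, matching the nine numbered steps in the table.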
with the vr5500, a doubleword is read in accordance with the sub-block order (refer to appendix a sub-block order) when a cache line is obtained from the external agent and replaced. doubleword transfer in this case is treated as 2-word transfer in sub-block order. the other doublewords, unaligned doublewords, words, and unaligned words are read in the same sequence as when they are written.
the table below shows the transfer sequence in both the little-endian and big-endian modes to read a block, doubleword, unaligned doubleword, word, and unaligned word.
table 14-3. data read sequence (1/2)
block (when a(4:3) = 00)
  little endian: 1. d0(31:0) 2. d0(63:32) 3. d1(31:0) 4. d1(63:32) 5. d2(31:0) 6. d2(63:32) 7. d3(31:0) 8. d3(63:32)
  big endian:    1. d0(63:32) 2. d0(31:0) 3. d1(63:32) 4. d1(31:0) 5. d2(63:32) 6. d2(31:0) 7. d3(63:32) 8. d3(31:0)
block (when a(4:3) = 01)
  little endian: 1. d1(31:0) 2. d1(63:32) 3. d0(31:0) 4. d0(63:32) 5. d3(31:0) 6. d3(63:32) 7. d2(31:0) 8. d2(63:32)
  big endian:    1. d1(63:32) 2. d1(31:0) 3. d0(63:32) 4. d0(31:0) 5. d3(63:32) 6. d3(31:0) 7. d2(63:32) 8. d2(31:0)
block (when a(4:3) = 10)
  little endian: 1. d2(31:0) 2. d2(63:32) 3. d3(31:0) 4. d3(63:32) 5. d0(31:0) 6. d0(63:32) 7. d1(31:0) 8. d1(63:32)
  big endian:    1. d2(63:32) 2. d2(31:0) 3. d3(63:32) 4. d3(31:0) 5. d0(63:32) 6. d0(31:0) 7. d1(63:32) 8. d1(31:0)
remark a: address, d: doubleword, w: word
dn: n+1th doubleword in block data (n = 0 to 3)
dn(31:0): lower word of doubleword data dn(63:0)
dn(63:32): higher word of doubleword data dn(63:0)
table 14-3. data read sequence (2/2)
block (when a(4:3) = 11)
  little endian: 1. d3(31:0) 2. d3(63:32) 3. d2(31:0) 4. d2(63:32) 5. d1(31:0) 6. d1(63:32) 7. d0(31:0) 8. d0(63:32)
  big endian:    1. d3(63:32) 2. d3(31:0) 3. d2(63:32) 4. d2(31:0) 5. d1(63:32) 6. d1(31:0) 7. d0(63:32) 8. d0(31:0)
doubleword (vr5432 native mode)
  little endian: 1. d(31:0) 2. d(63:32)
  big endian:    1. d(63:32) 2. d(31:0)
word, unaligned word
  little endian: 1. w(31:0)
  big endian:    1. w(31:0)
remarks 1. doubleword read requests are not supported in r5000 mode.
2. a: address, d: doubleword, w: word
dn: n+1th doubleword in block data (n = 0 to 3)
dn(31:0): lower word of doubleword data dn(63:0)
dn(63:32): higher word of doubleword data dn(63:0)
the external agent can write 1 word of data to the vr5500 at a time (refer to figure 14-11). therefore, it takes the external agent 1 system cycle to transfer a word to the vr5500.
14.3.4 system endianness
the endianness of the system is set by the bigendian pin after reset. the set endianness is indicated by the be bit of the config register.
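the four block rows of table 14-3 are consistent with a simple rule: the doubleword returned at position i is the one whose index is a(4:3) xor i. the sketch below encodes that observation; it is derived from the table rows only, the function name is hypothetical, and appendix a remains the authoritative definition of sub-block order.

```python
def block_read_sequence(a43, big_endian):
    """List the data cycles of a block read response for a given a(4:3)."""
    sequence = []
    for i in range(4):
        n = a43 ^ i  # doubleword index at position i (observed from table 14-3)
        low, high = f"d{n}(31:0)", f"d{n}(63:32)"
        # word order within a doubleword depends on the endianness
        sequence += [high, low] if big_endian else [low, high]
    return sequence


for a43 in range(4):
    print(format(a43, "02b"), block_read_sequence(a43, big_endian=False))
```

for a(4:3) = 01, for example, the doublewords come back in the order d1, d0, d3, d2, which is the second block row of the table.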
14.4 independent transfer with sysad bus
for general applications, the sysad bus is a point-to-point connection between the processor and a bidirectional registered transceiver in the external agent. for such applications, only the processor and external agent can be connected to the sysad bus.
for specific applications, other drivers and receivers are connected to the sysad bus so that transfer can be performed independently of the processor on the sysad bus. this is called independent transfer. to execute independent transfer, the external agent must arbitrate for control of the sysad bus by using the arbitration handshake signals and external null request.
the procedure of independent transfer of the sysad bus is as follows.
<1> the external agent requests the right to control the sysad bus by asserting the extrqst# signal to issue an external request.
<2> the processor releases the system interface to the slave status by asserting the release# signal.
<3> in this way, the external agent can execute independent transfer on the sysad bus. the validin# signal must not be asserted during transfer.
<4> when transfer is completed, the external agent releases and returns the system interface to the master status by issuing an external null request.
14.5 system interface cycle time
because processor requests are restricted by the system interface protocol, the number of request cycles can be determined from the protocol. because external requests have the following two types of wait times, the number of request cycles differs depending on these wait times.
• standby time until the processor releases the system interface to the slave status in response to an external request (release wait time)
• response time of the external request that requires a response (external response wait time)
while an external request is being issued, the release wait time differs depending on the status of the system interface. when the external request is detected, the system interface is released to the external agent after the cycle under processing.
the external response time of the vr5500 is kept to the minimum. data that is written is immediately loaded.
14.6 system interface commands and data identifiers
a system interface command defines the type and attribute of a system interface request. this definition is indicated in the address cycle of a request. the system interface data identifier defines the attribute of the data transferred in the system interface data cycle.
this section explains the syntax of the commands and data identifiers of the system interface, i.e., coding in bit units.
set the reserved bits and reserved area in the commands and data identifiers of the system interface related to external requests to 1. the reserved bits and reserved area in the commands and data identifiers of the system interface related to processor requests are undefined.
14.6.1 syntax of commands and data identifiers
the commands and data identifiers of the system interface are coded in 9-bit units, and transferred from the processor to the external agent, or vice versa, via the syscmd bus in the address cycle and data cycle. syscmd8 (most significant bit) determines whether the current contents of the syscmd bus are a command (address cycle) or data identifier (data cycle). if they are a command, clear syscmd8 to 0; if they are a data identifier, set it to 1.
14.6.2 syntax of command
this section explains the coding of the syscmd bus when a system interface command is used. figure 14-15 shows the common code used for all the system interface commands.
figure 14-15. bit definition of system interface command
bit 8      bits 7 to 5      bits 4 to 0
0          request type     details of request
be sure to clear syscmd8 to 0 when a system interface command is used.
syscmd(7:5) define the types of system interface requests such as read, write, and null.
table 14-4. code of system interface command syscmd(7:5)
bit            contents
syscmd(7:5)    command
               0: read request
               1: reserved
               2: write request
               3: null request
               4 to 7: reserved
syscmd(4:0) are determined according to the type of request.
a definition of each request is given below.
(1) read request
the code of the syscmd bus related to a read request is shown below.
figure 14-16 shows the format of the command when a read request is issued. tables 14-5 to 14-7 show the code of the read attribute of the syscmd(4:0) bits related to the read request.
figure 14-16. bit definition of syscmd bus during read request
bit 8      bits 7 to 5      bits 4 to 0
0          000              details of read request (refer to the tables below)
table 14-5. code of syscmd(4:3) during read request
bit            contents
syscmd(4:3)    read attribute
               0, 1: reserved
               2: block read
               3: single read
table 14-6. code of syscmd(2:0) during block read request
bit            contents
syscmd2        reserved
syscmd(1:0)    size of read block
               0: 2 words (in vr5432 native mode only)
               1: 8 words
               2, 3: reserved
table 14-7. code of syscmd(2:0) during single read request
bit            contents
syscmd2        reserved
syscmd(1:0)    read data size
               0: 1 byte is valid (byte).
               1: 2 bytes are valid (halfword).
               2: 3 bytes are valid.
               3: 4 bytes are valid (word).
(2) write request
the code of the syscmd bus related to a write request is shown below.
figure 14-17 shows the format of the command when a write request is issued. tables 14-8 to 14-10 show the code of the write attribute of the syscmd(4:0) bits related to the write request.
figure 14-17. bit definition of syscmd bus during write request
bit 8      bits 7 to 5      bits 4 to 0
0          010              details of write request (refer to the tables below)
table 14-8. code of syscmd(4:3) during write request
bit            contents
syscmd(4:3)    write attribute
               0, 1: reserved
               2: block write
               3: single write
table 14-9. code of syscmd(2:0) during block write request
bit            contents
syscmd2        update of cache line
               0: replaced
               1: retained
syscmd(1:0)    size of write block
               0: 2 words (in vr5432 native mode only)
               1: 8 words
               2, 3: reserved
table 14-10. code of syscmd(2:0) during single write request
bit            contents
syscmd2        reserved
syscmd(1:0)    write data size
               0: 1 byte is valid (byte).
               1: 2 bytes are valid (halfword).
               2: 3 bytes are valid.
               3: 4 bytes are valid (word).
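the field layout of a command (syscmd8 = 0, request type in syscmd(7:5), attribute in syscmd(4:3), size in syscmd(1:0)) can be decoded with a few shifts and masks. the following is a minimal sketch based on tables 14-4 to 14-10; the function name, dictionary keys, and string labels are hypothetical, and syscmd2 is not interpreted.

```python
REQUEST_TYPES = {0: "read", 2: "write", 3: "null"}  # table 14-4
ATTRIBUTES = {2: "block", 3: "single"}              # tables 14-5 and 14-8
BLOCK_SIZES = {0: "2 words", 1: "8 words"}          # tables 14-6 and 14-9


def decode_command(syscmd):
    """Decode a 9-bit system interface command into its fields."""
    if (syscmd >> 8) & 1:
        raise ValueError("syscmd8 = 1: this is a data identifier, not a command")
    request = REQUEST_TYPES.get((syscmd >> 5) & 0b111, "reserved")
    fields = {"request": request}
    if request in ("read", "write"):
        attribute = ATTRIBUTES.get((syscmd >> 3) & 0b11, "reserved")
        fields["attribute"] = attribute
        size = syscmd & 0b11
        if attribute == "block":
            fields["size"] = BLOCK_SIZES.get(size, "reserved")
        elif attribute == "single":
            fields["valid_bytes"] = size + 1  # 0 to 3 encode 1 to 4 bytes
    return fields


print(decode_command(0b0_000_11_0_11))  # single word read
print(decode_command(0b0_010_10_0_01))  # 8-word block write
```

the binary literals are grouped as bit 8, bits 7 to 5, bits 4 to 3, bit 2, and bits 1 to 0 to mirror the figures above.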
(3) null request
figure 14-18 shows the format of the command when a null request is used.
figure 14-18. bit definition of syscmd bus during null request
bit 8      bits 7 to 5      bits 4 to 0
0          011              details of null request (refer to the table below)
table 14-11 shows the code of the syscmd(4:3) bits related to the null request. for the null request, the syscmd(2:0) bits are reserved.
table 14-11. code of syscmd(4:3) during null request
bit            contents
syscmd(4:3)    null attribute
               0: released
               1 to 3: reserved
14.6.3 syntax of data identifier
this section explains coding of the syscmd bus when a system interface data identifier is used. figure 14-19 shows the common code used for all system interface data identifiers.
figure 14-19. bit definition of system interface data identifier
bit 8          1
bit 7          indication of last data
bit 6          indication of response data
bit 5          indication of error data
bit 4          data check enable
bits 3 to 0    reserved
be sure to set syscmd8 of the system interface data identifier to 1.
a definition of the syscmd(7:0) bits is given below.
syscmd7: indicates whether the data element is the last one.
syscmd6: indicates whether the data is response data. response data is returned in response to a read request.
syscmd5: indicates whether the data element contains an error. the error indicated in the data cannot be corrected. if this data is returned to the processor, a bus error exception occurs. in the case of a response block, send the entire line to the processor regardless of the degree of error. the external agent should ignore this bit in a processor data identifier because no error is indicated.
syscmd4: this bit in an external data identifier indicates whether the data of the data element and check bit are checked. this bit in a processor data identifier is reserved.
syscmd(3:0): these bits are reserved.
table 14-12 indicates the codes of syscmd(7:5) of a processor data identifier, and table 14-13 shows the codes of syscmd(7:4) of an external data identifier.
table 14-12. codes of syscmd(7:5) of processor data identifier
bit        contents
syscmd7    indication of last data element
           0: last data element
           1: not last data element
syscmd6    indication of response data
           0: response data
           1: not response data
syscmd5    indication of error data
           0: error occurred
           1: no error occurred
table 14-13. codes of syscmd(7:4) of external data identifier
bit        contents
syscmd7    indication of last data element
           0: last data element
           1: not last data element
syscmd6    indication of response data
           0: response data
           1: not response data
syscmd5    indication of error data
           0: error occurred
           1: no error occurred
syscmd4    data check enable
           0: data and check bit checked
           1: data and check bit not checked
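a data identifier can be split into its fields in the same way as a command. the sketch below is an illustration only: the function name and dictionary keys are hypothetical, syscmd7 and syscmd6 are interpreted according to tables 14-12 and 14-13, and syscmd5 and syscmd4 are returned as raw bit values so that the caller applies the tables to them.

```python
def decode_data_identifier(syscmd):
    """Split a 9-bit system interface data identifier into its fields."""
    if not (syscmd >> 8) & 1:
        raise ValueError("syscmd8 = 0: this is a command, not a data identifier")
    return {
        "last": ((syscmd >> 7) & 1) == 0,      # 0: last data element
        "response": ((syscmd >> 6) & 1) == 0,  # 0: response data
        "syscmd5": (syscmd >> 5) & 1,          # error indication, raw bit
        "syscmd4": (syscmd >> 4) & 1,          # data check enable, raw bit
    }


# last response data element, reserved bits 3 to 0 set to 1
print(decode_data_identifier(0b1_0_0_1_1_1111))
```

the reserved bits are set to 1 in the example because that is what the text requires for identifiers related to external requests.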
14.7 system interface address
the system interface address is a 32-bit physical address and is output in the address cycle, using all the bits of the sysad bus.
14.7.1 address specification rules
an address related to transferring data such as a word and an unaligned word is aligned in accordance with the size of the data element. the system uses the following address rules.
• an address related to the request of a block is aligned at the requested doubleword boundary. therefore, the lower 3 bits of the address are 0.
• the lower 3 bits of an address for a doubleword request are cleared to 0.
• the lower 2 bits of an address for a word request are cleared to 0.
• the least significant bit of an address for a halfword request is cleared to 0.
• each request of 1 and 3 bytes uses a byte address.
14.7.2 sub-block ordering
the order of the data returned in response to a processor block read request is sub-block ordering. with sub-block ordering, the processor outputs the address of the doubleword required in a block. the external agent must return a block that starts with the specified doubleword, by using sub-block ordering (for details, refer to appendix a sub-block order).
for a block write request, the processor always outputs the address of the first doubleword in the block. it sequentially outputs the doublewords in the block, starting from the first doubleword of the block.
remark the sequence of the data in a doubleword differs depending on the endianness (refer to tables 14-2 and 14-3).
in the data cycle, whether the byte lane of an aligned doubleword (or byte, halfword, 3 bytes, or word) is valid or not depends on the position of the data. in the little-endian mode, for example, sysad(7:0) are valid in the data cycle of a byte request whose lower 3 address bits are 0.
for the byte lane that is used when an unaligned word in big endian and little endian is transferred, refer to figure 3-3 byte specification related to load/store instruction. 14.7.3 processor internal address map for an external write, the external agent accesses the internal resources of the processor. when an external write request is made, the processor decodes the sysad(6:4) bits of the address that is output, to determine which of the resources of the processor is to be accessed. the only internal resource of the processor that can be accessed by an external write request is the interrupt register. access the interrupt register by an external write access, by specifying an address that clears sysad(6:4) to 000.
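the alignment rules of 14.7.1 and the sysad(6:4) decoding of 14.7.3 both reduce to masking address bits. a minimal sketch follows, with hypothetical names and request-size labels chosen for illustration.

```python
# low-order address bits that must be 0 for each request size (14.7.1)
ALIGN_MASK = {
    "block": 0b111,
    "doubleword": 0b111,
    "word": 0b11,
    "halfword": 0b1,
    "byte": 0,      # 1-byte requests use a byte address
    "3 bytes": 0,   # 3-byte requests use a byte address
}


def is_aligned(address, kind):
    """True if `address` satisfies the alignment rule for the request size."""
    return (address & ALIGN_MASK[kind]) == 0


def targets_interrupt_register(address):
    """True if sysad(6:4) of an external write address are 000 (14.7.3)."""
    return ((address >> 4) & 0b111) == 0


print(is_aligned(0x1008, "doubleword"), is_aligned(0x1004, "doubleword"))
print(targets_interrupt_register(0x0000), targets_interrupt_register(0x0010))
```

only bits 6 to 4 take part in the internal decode, so address bits outside that field do not change the result of the second function.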
chapter 15 system interface (out-of-order return mode)
this chapter explains the request protocol of the system interface in the 64-/32-bit out-of-order return mode. the system interface of the vr5500 enters the out-of-order return mode when a low level is input to the o3return# pin before a power-on reset.
vr5500 bus mode
  64-bit bus mode (busmode = h)
    out-of-order return mode (o3return# = l)
    r5000 mode (o3return# = h, dwbtrans# = h, disdvalido# = h)
    vr5432 native mode (o3return# = h, dwbtrans# = l, disdvalido# = l)
  32-bit bus mode (busmode = l)
    out-of-order return mode (o3return# = l)
    r5000 mode (compatible with rm523x) (o3return# = h, dwbtrans# = h, disdvalido# = h)
for the protocol in the normal mode (r5000 mode (operation mode compatible with the vr5000 series and rm523x) and vr5432 native mode), refer to chapter 13 system interface (64-bit bus mode) and chapter 14 system interface (32-bit bus mode).
15.1 overview
in the out-of-order return mode, the external agent can return a response to a processor read request regardless of the order in which the request has been issued. each request is issued with an identification number attached. if the external agent returns response data along with this identification number, the processor verifies the returned data and request.
the out-of-order return mode supports the following functions.
• two timing modes
select either pipeline mode or re-issuance mode.
• response queue of up to five entries
up to one instruction and four data entries can be managed.
the request cycles, basic operation of the protocol, and events that generate requests in the out-of-order return mode are the same as those in the normal mode. for details of these, refer to chapter 13 system interface (64-bit bus mode) and chapter 14 system interface (32-bit bus mode).
15.1.1 timing mode
the out-of-order return mode has two timing modes: re-issuance mode and pipeline mode. these modes can be selected by using the em0 bit of the config register in cp0. in the out-of-order return mode, the setting of the em1 bit of the config register is ignored.
• pipeline mode
the pipeline mode is selected when the em0 bit of the config register is cleared to 0. in this mode, even if the rdrdy#/wrrdy# signal is deasserted in the address cycle of a request, it is assumed that the request has been acknowledged.
• re-issuance mode
the re-issuance mode is selected when the em0 bit of the config register is set to 1. in this mode, a request is discarded if the rdrdy#/wrrdy# signal is deasserted in the address cycle of the request, and the same request is re-issued when the rdrdy#/wrrdy# signal is asserted.
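the difference between the two timing modes comes down to what happens when the ready signal is deasserted in the address cycle. a small sketch of that decision follows; the function name and the returned strings are hypothetical.

```python
def request_outcome(em0, ready_asserted):
    """Describe what happens to a request in its address cycle.

    em0: value of the em0 bit of the config register (0 or 1).
    ready_asserted: True if rdrdy#/wrrdy# is asserted in the address cycle.
    """
    if ready_asserted:
        return "acknowledged"
    if em0 == 0:
        return "acknowledged"  # pipeline mode: still treated as acknowledged
    return "discarded and re-issued later"  # re-issuance mode


for em0 in (0, 1):
    for ready in (True, False):
        print(em0, ready, request_outcome(em0, ready))
```

the two modes only differ in the case where the ready signal is deasserted, which is why the external agent must still accept that one request in the pipeline mode.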
15.1.2 master status and slave status
in the out-of-order return mode, the system interface changes its status from master to slave in the following cases.
• when the maximum five requests are stored in the response queue and the processor has no write request to issue.
• the processor has no requests after it has issued a read request.
remark the processor cannot issue a request in the following cases.
• when the processor has no requests.
• when the processor has a read request but the rdrdy# signal is inactive.
• when the processor has a write request but the wrrdy# signal is inactive.
when the system interface enters the slave status, the release# signal is asserted. therefore, the external agent must wait until the release# signal is asserted, and then obtain the right to control the system interface to start driving response data.
even when the system interface is in the slave status, the processor can request the right to control the system interface by asserting the preq# signal. when the active level of the preq# signal is detected, the external agent can return the right to control the system interface to the processor by issuing a null request. at this time, the rdrdy#/wrrdy# signal must also be asserted, so that the processor can issue the subsequent request. if the rdrdy#/wrrdy# signal remains inactive, the processor does not issue a request, and the system interface enters the slave status again even though the null request from the external agent returned it to the master status.
even if the maximum five requests are stored in the response queue, the preq# signal is asserted if read/write requests are accumulated in the processor. the external agent must process the processor requests by issuing a null request before the number of requests waiting for a response reaches five.
even if the external agent issues a null request when five requests are waiting for a response, processing of the requests does not proceed, and only the right to control the system interface is transferred.

15.1.3 identifying request

the vr5500 uses the sysid(2:0) signals to identify the contents of a read request issued in the out-of-order return mode. the sysid0 signal indicates whether reading an instruction or data is requested, and the sysid(2:1) signals indicate the request sequence (number). when reading an instruction is requested, the sysid(2:1) signals are always 00 (for details, refer to 15.4 request identifier). the status of the sysid(2:0) signals is undefined when a write request is made.
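the response queue limits (one instruction entry, four data entries) and the matching of out-of-order responses by identifier can be modeled roughly as follows. this is a sketch under assumptions: the class and method names are hypothetical, and the policy of reusing the lowest free id number is an assumption made for illustration, not something this manual states.

```python
class ResponseQueue:
    """Toy model of outstanding read requests keyed by their sysid(2:0) value."""

    def __init__(self):
        self.pending = {}  # sysid value -> address of the outstanding read

    def issue_read(self, address, is_data):
        """Record a read request and return its sysid, or None if no entry is free."""
        # sysid0: 0 = instruction, 1 = data. sysid(2:1): id number.
        # Instruction reads always use id number 0; data reads use 0 to 3.
        id_numbers = range(4) if is_data else range(1)
        for number in id_numbers:
            sysid = (number << 1) | int(is_data)
            if sysid not in self.pending:  # assumption: lowest free number is used
                self.pending[sysid] = address
                return sysid
        return None

    def complete(self, sysid):
        """Match a response to its request by identifier; any order is allowed."""
        return self.pending.pop(sysid)

    def is_full(self):
        """True when one instruction and four data reads are outstanding."""
        return len(self.pending) == 5
```

because responses are looked up by identifier rather than by position, complete() can be called in any order relative to issue_read().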
15.2 protocol of out-of-order return mode

this section explains the protocol of out-of-order return in the 64-bit bus mode. when using the 32-bit bus mode, read the sysad bus width as 32 bits.

the data shown in table 15-1 is driven onto the sysad, syscmd, and sysid buses. the symbols in this table are used in the timing charts shown later.

table 15-1. system interface bus data

range         symbol   meaning
common        unsd     unused
sysad(63:0)   addr     physical address of the request with identifier id
              data     (m+1th element of) data of the request with identifier id
syscmd(8:0)   read     read request command of processor or external agent
              write    write request command of processor or external agent
              null     external null request command
              eod      data identifier of last data element
              data     data identifier of data element other than last data element
sysid(2:0)    id       read request identifier
15.2.1 successive read requests

this section explains the protocol used in each mode when three processor read requests are issued in a row.

(1) when processor read/write request follows in pipeline mode

in the pipeline mode, the external agent must acknowledge a request even if the rdrdy# signal goes high in the address cycle.

in <1> to <3> in figure 15-1, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> to <4>. at this time, request identifiers are also driven onto the sysid bus.

in <4>, the external agent makes the rdrdy#/wrrdy# signal high, indicating that it can acknowledge no more read/write requests. however, the processor assumes that the request in the address cycle <4> has been acknowledged.

the external agent can return responses starting with any request for which data has been prepared. when driving response data, also drive the corresponding request identifier onto the sysid bus.

figure 15-1. successive read requests (in pipeline mode, with subsequent request)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#, wrrdy#)
(2) when processor read/write request does not follow in pipeline mode

in the pipeline mode, the external agent must acknowledge a request even if the rdrdy# signal goes high in the address cycle.

in <1> to <3> in figure 15-2, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> to <4>. at this time, request identifiers are also driven onto the sysid bus.

even if the external agent makes the rdrdy# signal high in the address cycle <4>, indicating that it cannot acknowledge a read request, the processor assumes that this request has been acknowledged.

the external agent can return responses starting with any request for which data has been prepared. when driving response data, also drive the corresponding request identifier onto the sysid bus.

figure 15-2. successive read requests (in pipeline mode, without subsequent request)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#)
(3) in re-issuance mode

if the rdrdy# signal goes high in the address cycle in the re-issuance mode, the processor discards the request and re-issues it when it returns to the master status.

in <1> to <3> in figure 15-3, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> to <4>. at this time, request identifiers are also driven onto the sysid bus.

if the external agent makes the rdrdy# signal high in the address cycle <4>, indicating that it cannot acknowledge a read request, the processor discards this request. when the processor later returns to the master status, it re-issues the request.

the external agent can return responses starting with any request for which data has been prepared. when driving response data, also drive the corresponding request identifier onto the sysid bus.

figure 15-3. successive read requests (in re-issuance mode)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#)
15.2.2 successive write requests

this section explains the protocol used in each mode when processor write requests are issued in a row.

(1) in pipeline mode

in the pipeline mode, the external agent must acknowledge a request even if the wrrdy# signal goes high in the address cycle.

in <1> to <3> in figure 15-4, the external agent makes the wrrdy# signal low, indicating that it is ready to acknowledge a write request. in response, the processor successively issues write requests in <2> to <4>. at this time, the status of the sysid bus is undefined.

even if the external agent makes the wrrdy# signal high in the address cycle <4>, indicating that it cannot acknowledge a write request, the processor assumes that this request has been acknowledged.

when the external agent makes the wrrdy# signal low in <5>, the processor completes issuance of the write request in <6>.

figure 15-4. successive write requests (in pipeline mode)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, wrrdy#)
note when the disdvalido# signal is low level
(2) in re-issuance mode

if the wrrdy# signal goes high in the address cycle in the re-issuance mode, the processor discards the request and re-issues it when the wrrdy# signal goes low.

in <1> to <3> in figure 15-5, the external agent makes the wrrdy# signal low, indicating that it is ready to acknowledge a write request. in response, the processor successively issues write requests in <2> to <4>. at this time, the status of the sysid bus is undefined.

if the external agent makes the wrrdy# signal high in the address cycle <4>, indicating that it cannot acknowledge a write request, the processor discards this request.

when the external agent makes the wrrdy# signal low in <5>, the processor re-issues in <6> the request discarded in <4>, and completes issuance of the write request.

figure 15-5. successive write requests (in re-issuance mode)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, wrrdy#)
note when the disdvalido# signal is low level
15.2.3 write request following read request

this section explains the protocol when a processor write request is issued immediately after a processor read request.

in <1> and <2> in figure 15-6, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, the request identifier is also driven onto the sysid bus.

in <4>, the external agent makes the wrrdy# signal low, indicating that it is ready to acknowledge a write request. in response, the processor issues a write request in <5>. at this time, the status of the sysid bus is undefined.

figure 15-6. write request following read request
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#, wrrdy#)
15.2.4 bus arbitration of processor

this section explains the protocol in each mode when an external read response is aborted by asserting the preq# signal.

(1) when processor read/write request follows in pipeline mode

in the pipeline mode, the external agent must acknowledge a request even if the rdrdy# signal goes high in the address cycle.

in <1> and <2> in figure 15-7, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, request identifiers are also driven onto the sysid bus.

in <3>, the external agent makes the rdrdy#/wrrdy# signal high, indicating that it can acknowledge no more read/write requests. however, the processor assumes that the request in the address cycle <3> has been acknowledged.

if the processor makes the preq# signal low while a response cycle is delayed because it takes time to prepare response data, the external agent can issue a null request (<4>) and return the right to control the system interface to the processor. by transferring the right of control in this way before the number of requests waiting for a response reaches five, requests can be processed efficiently.

when the external agent makes the rdrdy#/wrrdy# signal low in <5>, the processor completes issuance of the read/write request in <6>.

figure 15-7. bus arbitration of processor (in pipeline mode, with subsequent request)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#, wrrdy#, preq#)
(2) when processor read/write request does not follow in pipeline mode

in the pipeline mode, the external agent must acknowledge a request even if the rdrdy# signal goes high in the address cycle.

in <1> and <2> in figure 15-8, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, request identifiers are also driven onto the sysid bus.

even if the external agent makes the rdrdy#/wrrdy# signal high in the address cycle <3>, indicating that it cannot acknowledge a read/write request, the processor assumes that this request has been acknowledged.

if the processor makes the preq# signal low while a response cycle is delayed because it takes time to prepare response data, the external agent can issue a null request (<4>) and return the right to control the system interface to the processor. by transferring the right of control in this way before the number of requests waiting for a response reaches five, requests can be processed efficiently.

when the external agent makes the rdrdy#/wrrdy# signal low in <5>, the processor completes issuance of the read/write request in <6>.

figure 15-8. bus arbitration of processor (in pipeline mode, without subsequent request)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#, wrrdy#, preq#)
(3) in re-issuance mode

if the rdrdy# signal goes high in the address cycle in the re-issuance mode, the processor discards the request and re-issues it when it returns to the master status.

in <1> and <2> in figure 15-9, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, request identifiers are also driven onto the sysid bus.

if the external agent makes the rdrdy# signal high in the address cycle <3>, indicating that it cannot acknowledge a read request, the processor discards this request.

if the processor makes the preq# signal low while a response cycle is delayed because it takes time to prepare response data, the external agent can issue a null request (<4>) and return the right to control the system interface to the processor. by transferring the right of control in this way before the number of requests waiting for a response reaches five, requests can be processed efficiently.

when the external agent makes the rdrdy# signal low in <5>, the processor completes issuance of the read request in <6>.

figure 15-9. bus arbitration of processor (in re-issuance mode)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#, preq#)
15.2.5 single read request following block read request

this section explains the protocol in each mode when a processor single read request is issued immediately after a processor block read request.

(1) when processor read/write request follows in pipeline mode

in the pipeline mode, the external agent must acknowledge a request even if the rdrdy# signal goes high in the address cycle.

in <1> and <2> in figure 15-10, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, request identifiers are also driven onto the sysid bus.

even if the external agent makes the rdrdy# signal high in the address cycle <3>, indicating that it cannot acknowledge a read request, the processor assumes that this request has been acknowledged.

the external agent can return responses starting with any request for which data has been prepared. when driving response data, also drive the corresponding request identifier onto the sysid bus.

figure 15-10. single read request following block read request (in pipeline mode, with subsequent request)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#)
(2) when processor read/write request does not follow in pipeline mode

in the pipeline mode, the external agent must acknowledge a request even if the rdrdy# signal goes high in the address cycle.

in <1> and <2> in figure 15-11, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, request identifiers are also driven onto the sysid bus.

even if the external agent makes the rdrdy# signal high in the address cycle <3>, indicating that it cannot acknowledge a read request, the processor assumes that this request has been acknowledged.

the external agent can return responses starting with any request for which data has been prepared. when driving response data, also drive the corresponding request identifier onto the sysid bus.

figure 15-11. single read request following block read request (in pipeline mode, without subsequent request)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#)
(3) in re-issuance mode

if the rdrdy# signal goes high in the address cycle in the re-issuance mode, the processor discards the request and re-issues it when it returns to the master status.

in <1> and <2> in figure 15-12, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, request identifiers are also driven onto the sysid bus.

if the external agent makes the rdrdy# signal high in the address cycle <3>, indicating that it cannot acknowledge a read request, the processor discards this request.

figure 15-12. single read request following block read request (in re-issuance mode)
(timing chart omitted; signals shown: sysclock, sysad(63:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#)
15.2.6 unaligned 2-word read request

this section explains the protocol when a read request of unaligned 2-word data is issued in the 32-bit bus mode.

remark unaligned 2-word data is data of 5 to 8 bytes that is divided into 1 word and 1 to 4 bytes when processed.

to read unaligned 2-word data, two read requests are successively issued, and the same request identifier is driven onto the sysid bus for both. the external agent must return response data in the same sequence as the corresponding requests.

in <1> and <2> in figure 15-13, the external agent makes the rdrdy# signal low, indicating that it is ready to acknowledge a read request. in response, the processor successively issues read requests in <2> and <3>. at this time, the same request identifier is driven twice onto the sysid bus.

in <4> and <5>, the external agent must return the prepared response data in the same sequence as the requests. when the response data is driven, the corresponding request identifier must also be driven onto the sysid bus.

figure 15-13. unaligned 2-word read (in pipeline mode, with subsequent request)
(timing chart omitted; signals shown: sysclock, sysad(31:0), syscmd(8:0), sysid(2:0), validin#, validout#, release#, rdrdy#)
15.3 system interface commands and data identifiers

a system interface command defines the type and attributes of a system interface request. it is indicated in the address cycle of a request. the system interface data identifier defines the attributes of the data transferred in a system interface data cycle.

this section explains the syntax (bit-level coding) of the commands and data identifiers of the system interface in the out-of-order return mode.

set the reserved bits and reserved areas in the commands and data identifiers of the system interface related to external requests to 1. the reserved bits and reserved areas in the commands and data identifiers of the system interface related to processor requests are undefined.

15.3.1 syntax of commands and data identifiers

the commands and data identifiers of the system interface are coded in 9-bit units, and transferred from the processor to the external agent, or vice versa, via the syscmd bus in the address cycle and data cycle.

syscmd8 (most significant bit) determines whether the current contents of the syscmd bus are a command (address cycle) or a data identifier (data cycle). if they are a command, clear syscmd8 to 0; if they are a data identifier, set it to 1.

15.3.2 syntax of command

this section explains the coding of the syscmd bus when a system interface command is used.

figure 15-14 shows the common code used for all the system interface commands.

figure 15-14. bit definition of system interface command

bit        8    7 to 5         4 to 0
contents   0    request type   details of request

be sure to clear syscmd8 to 0 when a system interface command is used.

syscmd(7:5) define the types of system interface requests such as read, write, and null.

table 15-2. code of system interface command syscmd(7:5)

bit           contents
syscmd(7:5)   command
              0: read request
              1: reserved
              2: write request
              3: null request
              4 to 7: reserved

syscmd(4:0) are determined according to the type of request. a definition of each request is given below.
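the top-level split between commands and data identifiers (syscmd8) and the request type field (syscmd(7:5)) of table 15-2 can be illustrated with a short decoder. the function name classify_syscmd and the returned strings are hypothetical; the bit positions and codes are those given above.

```python
REQUEST_TYPES = {0: "read", 2: "write", 3: "null"}  # other codes are reserved


def classify_syscmd(syscmd):
    """Classify a 9-bit syscmd(8:0) value.

    Returns ("data identifier", None) when syscmd8 is 1, otherwise
    ("command", <request type>) decoded from syscmd(7:5).
    """
    if not 0 <= syscmd <= 0x1FF:
        raise ValueError("syscmd is a 9-bit value")
    if (syscmd >> 8) & 1:
        return ("data identifier", None)
    request_type = (syscmd >> 5) & 0b111
    return ("command", REQUEST_TYPES.get(request_type, "reserved"))
```

for instance, classify_syscmd(0b0_011_00_000) yields ("command", "null"), the external null request.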
(1) read request

the code of the syscmd bus related to a read request is shown below.

figure 15-15 shows the format of the command when a read request is issued.

tables 15-3 to 15-5 show the code of the read attribute of the syscmd(4:0) bits related to the read request.

figure 15-15. bit definition of syscmd bus during read request

bit        8    7 to 5   4 to 0
contents   0    000      details of read request (refer to the tables below)

table 15-3. code of syscmd(4:3) during read request

(a) in 64-bit bus mode

bit           contents
syscmd(4:3)   read attribute
              0: reserved
              1: reserved
              2: block read
              3: single read

(b) in 32-bit bus mode

bit           contents
syscmd(4:3)   read attribute
              0: reserved
              1: unaligned 2-word read (note)
              2: block read
              3: single read

note when an unaligned 2-word read request is issued, the processor drives the same request identifier twice onto the sysid bus. the external agent must return the response data to the unaligned 2-word read request in the same sequence as the requests.
table 15-4. code of syscmd(2:0) during block read request

(a) in 64-bit bus mode

bit           contents
syscmd2       reserved
syscmd(1:0)   size of read block
              0: reserved
              1: 8 words
              2, 3: reserved

(b) in 32-bit bus mode

bit           contents
syscmd2       reserved
syscmd(1:0)   size of read block
              0: 2 words (only when the dwbtrans# signal is low level)
              1: 8 words
              2, 3: reserved

table 15-5. code of syscmd(2:0) during single read request

(a) in 64-bit bus mode

bit           contents
syscmd(2:0)   read data size
              0: 1 byte is valid (byte).
              1: 2 bytes are valid (halfword).
              2: 3 bytes are valid.
              3: 4 bytes are valid (word).
              4: 5 bytes are valid.
              5: 6 bytes are valid.
              6: 7 bytes are valid.
              7: 8 bytes are valid (doubleword).

(b) in 32-bit bus mode

bit           contents
syscmd2       reserved
syscmd(1:0)   read data size
              0: 1 byte is valid (byte).
              1: 2 bytes are valid (halfword).
              2: 3 bytes are valid.
              3: 4 bytes are valid (word).
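tables 15-3 to 15-5 can be read together as a decoder for syscmd(4:0) of a read request. the sketch below is illustrative only: decode_read_details and its return format are hypothetical, reserved bits are masked off because they are undefined in processor requests, and the meaning of syscmd(2:0) for an unaligned 2-word read is not decoded because the tables above do not define it.

```python
def decode_read_details(syscmd, bus_width=64):
    """Decode syscmd(4:0) of a read request command.

    Returns (attribute, size): size is a string for block reads, a byte
    count for single reads, and None otherwise.
    """
    if bus_width not in (32, 64):
        raise ValueError("bus_width must be 32 or 64")
    attribute = (syscmd >> 3) & 0b11
    low = syscmd & 0b111
    if attribute == 2:                      # block read (syscmd2 is reserved)
        size = low & 0b11
        if size == 1:
            return ("block", "8 words")
        if size == 0 and bus_width == 32:   # valid only when dwbtrans# is low
            return ("block", "2 words")
        return ("block", "reserved")
    if attribute == 3:                      # single read
        code = low if bus_width == 64 else low & 0b11
        return ("single", code + 1)         # code n means n + 1 valid bytes
    if attribute == 1 and bus_width == 32:
        return ("unaligned 2-word", None)
    return ("reserved", None)
```

as a usage example, decode_read_details(0b11_111) returns ("single", 8), a doubleword read in the 64-bit bus mode.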
(2) write request

the code of the syscmd bus related to a write request is shown below.

figure 15-16 shows the format of the command when a write request is issued.

tables 15-6 to 15-8 show the code of the write attribute of the syscmd(4:0) bits related to the write request.

figure 15-16. bit definition of syscmd bus during write request

bit        8    7 to 5   4 to 0
contents   0    010      details of write request (refer to the tables below)

table 15-6. code of syscmd(4:3) during write request

(a) in 64-bit bus mode

bit           contents
syscmd(4:3)   write attribute
              0: reserved
              1: reserved
              2: block write
              3: single write

(b) in 32-bit bus mode

bit           contents
syscmd(4:3)   write attribute
              0: reserved
              1: unaligned 2-word write
              2: block write
              3: single write
table 15-7. code of syscmd(2:0) during block write request

(a) in 64-bit bus mode

bit           contents
syscmd2       update of cache line
              0: replaced
              1: retained
syscmd(1:0)   size of write block
              0: reserved
              1: 8 words
              2, 3: reserved

(b) in 32-bit bus mode

bit           contents
syscmd2       update of cache line
              0: replaced
              1: retained
syscmd(1:0)   size of write block
              0: 2 words (only when the dwbtrans# signal is low level)
              1: 8 words
              2, 3: reserved

table 15-8. code of syscmd(2:0) during single write request

(a) in 64-bit bus mode

bit           contents
syscmd(2:0)   write data size
              0: 1 byte is valid (byte).
              1: 2 bytes are valid (halfword).
              2: 3 bytes are valid.
              3: 4 bytes are valid (word).
              4: 5 bytes are valid.
              5: 6 bytes are valid.
              6: 7 bytes are valid.
              7: 8 bytes are valid (doubleword).

(b) in 32-bit bus mode

bit           contents
syscmd2       reserved
syscmd(1:0)   write data size
              0: 1 byte is valid (byte).
              1: 2 bytes are valid (halfword).
              2: 3 bytes are valid.
              3: 4 bytes are valid (word).
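going the other way, a test bench that imitates processor write commands could assemble syscmd(8:0) from tables 15-6 to 15-8 as sketched below. the helper names are hypothetical, and reserved bits are simply left at 0 here as an arbitrary choice, since the manual states they are undefined in processor requests.

```python
WRITE_REQUEST = 0b010 << 5   # syscmd8 = 0 (command), syscmd(7:5) = 010 (write)


def single_write_command(nbytes, bus_width=64):
    """Build the syscmd value for a single write of nbytes valid bytes."""
    max_bytes = 8 if bus_width == 64 else 4
    if not 1 <= nbytes <= max_bytes:
        raise ValueError("unsupported write size for this bus width")
    return WRITE_REQUEST | (0b11 << 3) | (nbytes - 1)


def block_write_command(retain_line, two_words=False):
    """Build the syscmd value for a block write.

    retain_line sets syscmd2 (0: cache line replaced, 1: retained).
    two_words selects the 2-word size, which the tables allow only in the
    32-bit bus mode when dwbtrans# is low; otherwise the size is 8 words.
    """
    size = 0 if two_words else 1
    return WRITE_REQUEST | (0b10 << 3) | (int(retain_line) << 2) | size
```

single_write_command(8) gives 0x5f and block_write_command(True) gives 0x55.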
(3) null request

figure 15-17 shows the format of the command when a null request is used.

table 15-9 shows the code of the syscmd(4:3) bits related to the null request. for the null request, the syscmd(2:0) bits are reserved.

figure 15-17. bit definition of syscmd bus during null request

bit        8    7 to 5   4 to 0
contents   0    011      details of null request (refer to the table below)

table 15-9. code of syscmd(4:3) during null request

bit           contents
syscmd(4:3)   null attribute
              0: released
              1 to 3: reserved

15.3.3 syntax of data identifier

this section explains coding of the syscmd bus when a system interface data identifier is used.

figure 15-18 shows the common code used for all system interface data identifiers.

figure 15-18. bit definition of system interface data identifier

bit      contents
8        1
7        indication of last data
6        indication of response data
5        indication of error data
4        data check enable
3 to 0   reserved

be sure to set syscmd8 of the system interface data identifier to 1.
a definition of the syscmd(7:0) bits is given below.

syscmd7: indicates whether the data element is the last one.
syscmd6: indicates whether the data is response data. response data is returned in response to a read request.
syscmd5: indicates whether the data element contains an error. the error indicated in the data cannot be corrected. if this data is returned to the processor, a bus error exception occurs. in the case of a response block, send the entire line to the processor regardless of the degree of error. the external agent should ignore this bit in a processor data identifier because no error is indicated.
syscmd4: this bit in an external data identifier indicates whether the data of the data element and check bit are checked. this bit in a processor data identifier is reserved.
syscmd(3:0): these bits are reserved.

table 15-10 shows the codes of syscmd(7:5) of a processor data identifier, and table 15-11 shows the codes of syscmd(7:4) of an external data identifier.

table 15-10. codes of syscmd(7:5) of processor data identifier

bit       contents
syscmd7   indication of last data element
          0: last data element
          1: not last data element
syscmd6   indication of response data
          0: response data
          1: not response data
syscmd5   indication of error data
          0: error occurred
          1: no error occurred

table 15-11. codes of syscmd(7:4) of external data identifier

bit       contents
syscmd7   indication of last data element
          0: last data element
          1: not last data element
syscmd6   indication of response data
          0: response data
          1: not response data
syscmd5   indication of error data
          0: error occurred
          1: no error occurred
syscmd4   data check enable
          0: data and check bit checked
          1: data and check bit not checked

remark to enable data check, clear the de bit of the status register in cp0 to 0.
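note that every flag in tables 15-10 and 15-11 is active at 0. a decoder makes this explicit. the function name and dictionary keys below are hypothetical; the bit assignments follow the tables.

```python
def _bit(value, position):
    """Return bit `position` of `value` as 0 or 1."""
    return (value >> position) & 1


def decode_data_identifier(syscmd, external):
    """Decode a data identifier driven on syscmd(8:0).

    `external` selects the external data identifier layout, which also
    defines syscmd4; in a processor data identifier that bit is reserved.
    """
    if _bit(syscmd, 8) != 1:
        raise ValueError("syscmd8 must be 1 for a data identifier")
    fields = {
        "last_element": _bit(syscmd, 7) == 0,   # 0: last data element
        "response_data": _bit(syscmd, 6) == 0,  # 0: response data
        "error": _bit(syscmd, 5) == 0,          # 0: error occurred
    }
    if external:
        fields["check_enabled"] = _bit(syscmd, 4) == 0  # 0: checked
    return fields
```

for example, 0x13f (reserved bits set to 1, as required for external requests) decodes as the last element of error-free response data with checking disabled.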
15.4 request identifier

in the out-of-order return mode, the processor drives a request identifier onto the sysid bus. the request identifier defines the target of the read request and the sequence of issuance (id number). it is indicated in the address cycle of the request. the sysid bus is in an undefined state when a write request is issued.

sysid0 (least significant bit) determines whether the target of the current request is an instruction or data. sysid(2:1) define the id number of the read request.

tables 15-12 to 15-14 show the code of the request identifier.

table 15-12. code of request identifier sysid0

bit      contents
sysid0   request target
         0: instruction
         1: data

table 15-13. code of sysid(2:1) during instruction read

bit          contents
sysid(2:1)   request issuance sequence
             0: id0 (first)
             1 to 3: reserved

table 15-14. code of sysid(2:1) during data read

bit          contents
sysid(2:1)   request issuance sequence
             0: id0 (first)
             1: id1 (second)
             2: id2 (third)
             3: id3 (fourth)
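tables 15-12 to 15-14 amount to the following decoding of sysid(2:0). decode_sysid is a hypothetical helper name, and reporting reserved instruction codes as "reserved" is a choice made for this sketch.

```python
def decode_sysid(sysid):
    """Decode sysid(2:0) of a read request into (target, id number)."""
    if not 0 <= sysid <= 0b111:
        raise ValueError("sysid is a 3-bit value")
    is_data = sysid & 1            # sysid0: 0 = instruction, 1 = data
    number = (sysid >> 1) & 0b11   # sysid(2:1): id number
    if is_data:
        return ("data", number)    # id0 to id3
    if number == 0:
        return ("instruction", 0)  # instruction reads only use id0
    return ("reserved", number)
```

decode_sysid(0b111) returns ("data", 3), the fourth data read identifier.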
chapter 16 interrupts

this chapter explains the following four types of interrupts in the vr5500.

(1) non-maskable interrupt (nmi): 1 source
(2) external ordinary interrupt: 6 sources (of which one is exclusive with a timer interrupt)
(3) software interrupt: 2 sources
(4) timer interrupt: 1 source (which is exclusive with one external ordinary interrupt)

16.1 interrupt request type

16.1.1 non-maskable interrupt (nmi)

the nmi request is acknowledged when the nmi# signal is asserted, and execution branches to the reset exception vector. the nmi# signal is latched by an internal register at the rising edge of the sysclock signal as shown in figure 16-1. this signal is edge-triggered.

this interrupt request can also be set by an external write request via the sysad bus. in the data cycle, sysad6 serves as an nmi request bit (1: request), and sysad22 serves as the write enable bit (1: enable) corresponding to sysad6.

an nmi cannot be masked.

figure 16-1 shows the internal processing of the nmi# signal. a low-level signal input to the nmi# pin is latched to an internal register at the rising edge of sysclock. the latched nmi# signal is inverted and ored with bit 6 of the interrupt register, and transmitted to the internal units as an nmi request.

figure 16-1. nmi# signal
(block diagram omitted; elements shown: nmi# pin, internal register clocked by sysclock, bit 6 of the interrupt register (internal) set by an external write request, nmi interrupt)
16.1.2 external ordinary interrupt

this interrupt is acknowledged when the int(5:0)# signals are made low, which sets the ip(7:2) bits of the cause register. the int(5:0)# signals are level-triggered. keep these signals low until an interrupt exception occurs. after the interrupt exception has occurred, return the asserted signals to high level before execution returns to the normal routine or before multiple interrupts are enabled.

an external ordinary interrupt request can also be set by an external write request via the sysad bus. in the data cycle, sysad(5:0) serve as external interrupt request bits (1: request), and sysad(21:16) serve as write enable bits (1: enable) corresponding to sysad(5:0). after an interrupt exception has occurred, issue the external write request again before execution returns to the ordinary routine or before multiple interrupts are enabled, and clear the corresponding bit of the interrupt register to 0.

the interrupt request made by the int5# signal or sysad5 is mutually exclusive with the timer interrupt. if a low level is input to the tintsel pin before a power-on reset, the interrupt request by int5# or sysad5 becomes valid.

an external ordinary interrupt request can be masked by the im(7:2), ie, exl, and erl bits of the status register.

16.1.3 software interrupts

software interrupt requests are acknowledged when bits 1 and 0 of the ip (interrupt pending) field in the cause register are set. these bits must be written by software; there is no hardware mechanism to set or clear them. after the occurrence of an interrupt exception, the corresponding bit of the ip field in the cause register must be cleared (0) before returning to the ordinary routine or before multiple interrupts are enabled.

a software interrupt request can be masked by the im(1:0), ie, exl, and erl bits of the status register.
16.1.4 timer interrupt

this interrupt request uses bit 7 in the ip (interrupt pending) field of the cause register. the ip7 bit is automatically set and the interrupt request is acknowledged if the value of the count register becomes equal to that of the compare register or if the performance counter overflows.

the timer interrupt is mutually exclusive with the interrupt request made by the int5# signal or sysad5. if a high level is input to the tintsel pin before a power-on reset, the timer interrupt request becomes valid.

a timer interrupt request can be masked by the im7, ie, exl, and erl bits of the status register.

16.2 acknowledging interrupt request signal

if the external agent issues an external write request that makes sysad(6:4) = 000, the data is written to the interrupt register. this register can be accessed in the external write cycle but not in the external read cycle. when a request is written to the interrupt register, the processor ignores the address issued by the external agent. unlike the cp0 registers, this register cannot be read or written by software.

in the data cycle, each bit of sysad(22:16) enables a write access to the corresponding bit of the interrupt register, allowing the values of sysad(6:0) to be written to the bits of the interrupt register. therefore, bits 0 to 6 of the interrupt register can be set or cleared by issuing an external write request only once. this mechanism is illustrated in figure 16-2, along with the nmi described above.
figure 16-2. bits of interrupt register and enable bits
[figure: sysad(5:0), gated by the write enable bits sysad(21:16), are written to bits 5 to 0 of the interrupt register (internal) and become the external interrupt requests (refer to figures 16-3 and 16-4). sysad6, gated by the write enable bit sysad22, is written to bit 6 and becomes the non-maskable interrupt request (refer to figure 16-1).]

bit            function                                        setting
sysad(5:0)     external interrupt request for each bit         1: request    0: no request
sysad(21:16)   write enable bits of sysad(5:0) for each bit    1: enabled    0: disabled
sysad6         non-maskable interrupt request                  1: request    0: no request
sysad22        write enable bit of sysad6                      1: enabled    0: disabled
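the write-enable mechanism described in 16.2 amounts to a masked update of the 7-bit interrupt register. the following is a minimal sketch of that behavior, assuming the register is modeled as a plain integer; the function name write_interrupt_register is hypothetical.

```python
def write_interrupt_register(current: int, sysad: int) -> int:
    """Model one external write to the interrupt register.

    current: present value of interrupt register bits 6 to 0
    sysad:   data-cycle value on the sysad bus
    Hypothetical helper; only bits whose write enable is 1 change.
    """
    data = sysad & 0x7F             # sysad(6:0): request bits
    enable = (sysad >> 16) & 0x7F   # sysad(22:16): write enable bits
    kept = current & ~enable & 0x7F # bits with enable = 0 keep their value
    return kept | (data & enable)   # bits with enable = 1 take the new value
```

because set and clear are both expressed through the enable mask, a single write can set some request bits and clear others, which matches the statement that bits 0 to 6 can be set or cleared by issuing only one external write request.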
16.2.1 detecting hardware interrupt

figure 16-3 illustrates how a hardware interrupt request is detected by using the cause register.

- bit 15 (ip7) of the cause register is directly checked for the timer interrupt request.
- bits 15 to 10 (ip(7:2)) of the cause register are directly checked for external ordinary interrupt requests (int(5:0)# and sysad(5:0)).
- whether ip7 indicates the timer interrupt request or the interrupt request made by int5# or sysad5 is determined according to the status of the tintsel pin before a power-on reset. if this pin is high, ip7 indicates the timer interrupt. if it is low, ip7 indicates the interrupt request made by int5# or sysad5.

ip0 and ip1 of the cause register are used for software interrupt requests (for details, refer to chapter 6 exception processing). software interrupts cannot be set or cleared by hardware.

figure 16-3. hardware interrupt request signal
[figure: int(4:0)#, latched in an internal register, and bits 4 to 0 of the interrupt register (internal) drive ip2 to ip6 (bits 10 to 14 of the cause register). a selector controlled by tintsel chooses between the timer interrupt and int5# / bit 5 of the interrupt register to drive ip7 (bit 15). refer to figure 16-4.]
16.2.2 masking interrupt signal

figure 16-4 illustrates how an interrupt signal is masked.

- bits 15 to 8 (ip(7:0)) of the cause register are connected to the interrupt mask bits (bits 15 to 8, i.e., im(7:0)) of the status register by an and-or logic block, masking each interrupt request signal.
- bit 0 of the status register is a global interrupt enable (ie) bit. the output of this bit is anded with the output of the and-or logic block to generate the interrupt request signals of the vr5500. in addition, interrupts are enabled only while the exl and erl bits of the status register are both 0.

figure 16-4. masking interrupt signal
[figure: ip(7:0) (cause register bits 15 to 8) and im(7:0) (status register bits 15 to 8) enter an and-or block; its 1-bit output is anded with ie (status register bit 0) in an and block to produce the interrupt of the vr5500. ip0 and ip1 are software interrupts, ip2 to ip6 are external ordinary interrupts, and ip7 is the timer interrupt or an external ordinary interrupt.]

bit        function                          setting
ie         enables all interrupts            1: enables            0: disables
im(7:0)    interrupt mask for each bit       1: enabled            0: disabled
ip(7:0)    interrupt request for each bit    1: request pending    0: not pending
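the masking logic of figure 16-4 can be sketched as a single boolean expression. the bit positions of ip, im, and ie come from the text above; the positions used for exl (bit 1) and erl (bit 2) follow the usual mips status register layout and are an assumption here, since this section does not state them. the function name interrupt_taken is hypothetical.

```python
def interrupt_taken(status: int, cause: int) -> bool:
    """Return True if an interrupt request reaches the processor.

    Hypothetical model of the and-or block and the and block in figure 16-4.
    """
    ip = (cause >> 8) & 0xFF    # cause register bits 15 to 8: ip(7:0)
    im = (status >> 8) & 0xFF   # status register bits 15 to 8: im(7:0)
    ie = status & 1             # status register bit 0: global enable
    exl = (status >> 1) & 1     # assumed position of exl (not given in this section)
    erl = (status >> 2) & 1     # assumed position of erl (not given in this section)
    # and-or block: any pending request whose mask bit is 1
    any_unmasked = (ip & im) != 0
    return any_unmasked and ie == 1 and exl == 0 and erl == 0
```

a pending ip7 request, for instance, is delivered only when im7 and ie are 1 and neither exl nor erl is set.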
chapter 17 cpu instruction set

this chapter provides a detailed description of the operation of each cpu instruction in both 32- and 64-bit modes. the instructions are listed in alphabetical order. for details of the fpu instruction set, refer to chapter 18 fpu instruction set.

17.1 instruction notation conventions

in this chapter, all variable subfields in an instruction format (such as rs, rt, immediate, etc.) are shown in lowercase names. the instruction names (e.g. add and sub) are indicated by upper-case characters.

for the sake of clarity, we sometimes use an alias for a variable subfield in the formats of specific instructions. for example, we use base instead of rs in the format for load and store instructions. such an alias is always lower case, since it refers to a variable subfield.

the architecture level at which the instruction was first defined is indicated on the right of the instruction format. the product name is also shown for instructions that may be incorporated differently depending on the product.

figures with the actual bit encoding for all the mnemonics are located at the end of this chapter (17.4 cpu instruction opcode bit encoding), and the bit encoding also accompanies each instruction.

in the instruction descriptions that follow, the operation section describes the operation performed by each instruction using a high-level language notation. the vr5500 can operate as either a 32- or 64-bit microprocessor, and the operation for both modes is included with the instruction description. special symbols used in the notation are described in table 17-1.
table 17-1. cpu instruction operation notations

symbol          meaning
←               assignment
||              bit string concatenation
x^y             replication of bit value x into a y-bit string. x is always a single-bit value
x_y..z          selection of bits y to z of bit string x. little-endian bit notation is always used. if y is less than z, this expression is an empty (zero length) bit string
+               2's complement or floating-point addition
-               2's complement or floating-point subtraction
*               2's complement or floating-point multiplication
div             2's complement integer division
mod             2's complement modulo
/               floating-point division
<               2's complement less than comparison
and             bit-wise logical and
or              bit-wise logical or
xor             bit-wise logical xor
nor             bit-wise logical nor
gpr[x]          general-purpose register x. the content of gpr[0] is always zero. attempts to alter the content of gpr[0] have no effect.
cpr[z, x]       coprocessor unit z, general-purpose register x.
ccr[z, x]       coprocessor unit z, control register x.
coc[z]          coprocessor unit z condition signal.
bigendianmem    big-endian mode as configured at reset (0 little, 1 big). specifies the endianness of the memory interface (see table 17-2 load and store common functions), and the endianness in kernel and supervisor mode. the status of the be bit of the config register is reflected.
reverseendian   signal to reverse the endianness of load and store instructions. the status of bit 25 of the status register is reflected. this value is always 0 in the vr5500.
bigendiancpu    the endianness for load and store instructions (0 little, 1 big). this variable is computed as bigendianmem xor reverseendian.
t + i:          indicates the time steps between operations. the statements within a time step are defined to be executed in sequential order (as modified by conditional and loop constructs). operations which are marked t + i: are executed at instruction cycle i relative to the start of execution of the instruction. thus, an instruction which starts at time j executes operations marked t + i: at time i + j. the interpretation of the order of execution between two instructions or two operations that execute at the same time should be pessimistic; the order is not defined.
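the bit-string operators of table 17-1 (replication, concatenation, and selection) can be modeled with python strings of '0' and '1' characters. this is a sketch for experimenting with the notation only; the helper names replicate, select, and to_bits are hypothetical.

```python
def to_bits(value: int, width: int) -> str:
    """Render an unsigned integer as a bit string of the given width."""
    return format(value & ((1 << width) - 1), "0{}b".format(width))

def replicate(bit: int, count: int) -> str:
    """x^y: replicate a single-bit value x into a y-bit string."""
    return str(bit & 1) * count

def select(x: str, y: int, z: int) -> str:
    """x_y..z: bits y down to z of bit string x (bit 0 is the rightmost)."""
    if y < z:
        return ""                   # empty (zero length) bit string
    n = len(x)
    return x[n - 1 - y : n - z]

def sign_extend_16_to_32(immediate: int) -> int:
    """(immediate_15)^16 || immediate_15..0 as a 32-bit unsigned value."""
    imm = to_bits(immediate, 16)
    sign = int(select(imm, 15, 15), 2)
    # '+' on Python strings plays the role of the || concatenation operator
    return int(replicate(sign, 16) + select(imm, 15, 0), 2)

def shift_immediate_high(immediate: int) -> int:
    """immediate || 0^16 as a 32-bit unsigned value."""
    return int(to_bits(immediate, 16) + replicate(0, 16), 2)
```

with these helpers, sign_extend_16_to_32(0x8001) evaluates to 0xffff8001 and shift_immediate_high(0x1234) to 0x12340000.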
the following examples illustrate the application of some of the instruction notation conventions:

example 1: gpr[rt] ← immediate || 0^16
sixteen zero bits are concatenated with an immediate value (typically 16 bits), and the 32-bit string is assigned to general-purpose register rt.

example 2: (immediate_15)^16 || immediate_15..0
bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the result is concatenated with bits 15 to 0 of the immediate value to form a 32-bit sign-extended value.

17.2 cautions on using cpu instructions

17.2.1 load and store instructions

the instruction immediately after a load instruction can use the contents of a register that has been loaded, but execution of that instruction may be delayed. the vr5500 can cover the load delay using an out-of-order mechanism, but it is recommended to schedule the load delay slot to improve the performance.

with the vr5500, two special instructions, a load link instruction and a conditional store instruction, can be used. these instructions are used in a carefully programmed sequence when one of the synchronous primitives (such as test & set, lock of bit level, semaphore, and sequencer/event counter) is executed. these instructions are defined in the vr5500 to maintain compatibility with the other processors.

in the load and store descriptions, the functions listed below are used to summarize the handling of virtual addresses and physical memory.

table 17-2. load and store common functions

function             meaning
addresstranslation   uses the tlb to find the physical address given the virtual address. the function fails and a tlb refill exception occurs if the required translation is not present in the tlb.
loadmemory           uses the cache and main memory to find the contents of the word containing the specified physical address. the lower 3 bits of the address and the access type field indicate which of the four bytes within the data word need to be returned. if the cache is enabled for this access, the entire word is returned and loaded to the cache. if the specified data is shorter than the word length, the position in which the data is stored is determined by the endian mode and reverse-endian mode.
storememory          uses the cache, write buffer, and main memory to store the word or part of word specified as data in the word containing the specified physical address. the lower 3 bits of the address and the access type field indicate which of the four bytes within the data word should be stored. if the specified data is shorter than the word length, the position in which the data is stored is determined by the endian mode and reverse-endian mode.
the access type field indicates the size of the data item to be loaded or stored. regardless of access type or byte-numbering order (endian), the address specifies the byte that has the smallest byte address in the addressed field. this byte is the leftmost byte in a big-endian system, where it holds the sign of a 2's complement value, and the rightmost byte in a little-endian system.

table 17-3. access type specifications for loads/stores

access type   syscmd(2:0)   meaning
doubleword    7             8 bytes (64 bits)
septibyte     6             7 bytes (56 bits)
sextibyte     5             6 bytes (48 bits)
quintibyte    4             5 bytes (40 bits)
word          3             4 bytes (32 bits)
triplebyte    2             3 bytes (24 bits)
halfword      1             2 bytes (16 bits)
byte          0             1 byte (8 bits)

the bytes within the addressed doubleword that are used can be determined directly from the access type and the lower 3 bits of the address.

17.2.2 jump and branch instructions

the jump and branch instructions have a branch delay slot. a jump or branch instruction cannot be used in a delay slot. if one is used, the error is not detected and the results of such an operation are undefined.

if an exception or interrupt prevents the completion of a legal instruction during a delay slot, the hardware sets the epc register to point at the jump or branch instruction that precedes it. when the code is restarted, both the jump or branch instruction and the instruction in the delay slot are reexecuted.

because jump and branch instructions may be restarted after exceptions or interrupts, they must be restartable. therefore, when a jump or branch instruction stores a return link value, cpu general-purpose register r31 (the register in which the link is stored) may not be used as a source register.

since instructions must be word-aligned, a jump register or jump and link register instruction must use a register whose content (address) has its lower 2 bits set to zero. if the lower 2 bits are not zero, an address error exception will occur when the jump target instruction is subsequently fetched.

17.2.3 coprocessor instructions

the coprocessor is an alternate execution unit and has a register file independent of that of the cpu. the mips architecture allows four coprocessor units to be defined. each of these coprocessors has two register spaces, and each register space has thirty-two 32-bit registers. the coprocessor instructions modify the registers in either of the spaces.

- coprocessor general-purpose registers are allocated in the first space. these registers directly load/store data from/in the main memory. they can also be used to transfer data between coprocessors.
- coprocessor control registers are allocated in the second space. these registers can transfer their contents only between coprocessors.
17.2.4 system control coprocessor (cp0) instructions

there are some special limitations imposed on operations involving cp0, which is incorporated within the cpu. although load and store instructions to transfer data to/from coprocessors and move control to/from coprocessor instructions are generally permitted by the mips architecture, cp0 is given a somewhat protected status since it has responsibility for exception handling and memory management. therefore, the move to/from coprocessor instructions are the only valid mechanism for writing to and reading from the cp0 registers.

several cp0 instructions are defined to directly read, write, and probe tlb entries and to modify the operating modes in preparation for returning to user mode or interrupt-enabled states.

17.3 cpu instruction

this section describes the functions of cpu instructions in detail for both 32-bit address mode and 64-bit address mode. the exceptions that may occur when each instruction is executed are listed at the end of its description. for details of exceptions and their processing, see chapter 6 exception processing.
add    add

bits     31-26            25-21   20-16   15-11   10-6      5-0
field    special 000000   rs      rt      rd      0 00000   add 100000

format: add rd, rs, rt    (mips i)

purpose: adds 32-bit integers. a trap is performed if an overflow occurs.

description: the contents of general-purpose register rs and the contents of general-purpose register rt are added and the result is stored in general-purpose register rd. in 64-bit mode, the operands must be valid sign-extended, 32-bit values. an integer overflow exception occurs if the carries out of bits 30 and 31 differ (2's complement overflow). the destination register rd is not modified when an integer overflow exception occurs.

operation:
32  t:  gpr[rd] ← gpr[rs] + gpr[rt]
64  t:  temp ← gpr[rs] + gpr[rt]
        gpr[rd] ← (temp_31)^32 || temp_31..0

exceptions: integer overflow exception
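the 64-bit mode behavior of add can be sketched as follows. this is an illustrative model only: registers are represented as unsigned 64-bit python integers, and the names add32, to_signed32, and IntegerOverflow are hypothetical. the check on the signed range is equivalent to the condition that the carries out of bits 30 and 31 differ.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

class IntegerOverflow(Exception):
    """Stands in for the integer overflow exception (hypothetical name)."""

def to_signed32(value: int) -> int:
    """Interpret the lower 32 bits of a register as a 2's complement value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def add32(rs: int, rt: int) -> int:
    """Return the new 64-bit value of rd, or raise on 2's complement overflow."""
    total = to_signed32(rs) + to_signed32(rt)
    if not -(1 << 31) <= total <= (1 << 31) - 1:
        # rd is left unmodified in this case, so nothing is returned
        raise IntegerOverflow("add result does not fit in 32 bits")
    # masking a negative Python int yields (temp_31)^32 || temp_31..0
    return total & MASK64
```

addu follows the same data path but never raises; a caller modeling it would drop the range check and sign-extend the lower 32 bits of the sum.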
addi    add immediate

bits     31-26         25-21   20-16   15-0
field    addi 001000   rs      rt      immediate

format: addi rt, rs, immediate    (mips i)

purpose: adds a 32-bit integer to a constant. a trap is performed if an overflow occurs.

description: the 16-bit immediate is sign-extended and added to the contents of general-purpose register rs and the result is stored in general-purpose register rt. in 64-bit mode, the operand must be a valid sign-extended, 32-bit value. an integer overflow exception occurs if the carries out of bits 30 and 31 differ (2's complement overflow). the destination register rt is not modified when an integer overflow exception occurs.

operation:
32  t:  gpr[rt] ← gpr[rs] + (immediate_15)^16 || immediate_15..0
64  t:  temp ← gpr[rs] + (immediate_15)^48 || immediate_15..0
        gpr[rt] ← (temp_31)^32 || temp_31..0

exceptions: integer overflow exception
addiu    add immediate unsigned

bits     31-26          25-21   20-16   15-0
field    addiu 001001   rs      rt      immediate

format: addiu rt, rs, immediate    (mips i)

purpose: adds a 32-bit integer to a constant.

description: the 16-bit immediate is sign-extended and added to the contents of general-purpose register rs and the result is stored in general-purpose register rt. no integer overflow exception occurs under any circumstances. in 64-bit mode, the operand must be a valid sign-extended, 32-bit value. the only difference between this instruction and the addi instruction is that addiu never causes an integer overflow exception.

operation:
32  t:  gpr[rt] ← gpr[rs] + (immediate_15)^16 || immediate_15..0
64  t:  temp ← gpr[rs] + (immediate_15)^48 || immediate_15..0
        gpr[rt] ← (temp_31)^32 || temp_31..0

exceptions: none
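the 64-bit mode operation of addiu can be sketched as below: the immediate is sign-extended, the sum wraps silently, and bit 31 of the result is copied into the upper 32 bits. registers are modeled as unsigned 64-bit integers, and the function name addiu64 is hypothetical.

```python
def addiu64(rs: int, immediate: int) -> int:
    """Return the new 64-bit value of rt for addiu in 64-bit mode."""
    imm = immediate & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000              # sign-extend the 16-bit immediate
    low = (rs + imm) & 0xFFFFFFFF   # temp_31..0; overflow is simply discarded
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000   # (temp_31)^32 in the upper half
    return low
```

note that addiu64(0x7fffffff, 1) returns 0xffffffff80000000 without any exception, which is the only behavioral difference from addi in this model.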
addu    add unsigned

bits     31-26            25-21   20-16   15-11   10-6      5-0
field    special 000000   rs      rt      rd      0 00000   addu 100001

format: addu rd, rs, rt    (mips i)

purpose: adds 32-bit integers.

description: the contents of general-purpose register rs and the contents of general-purpose register rt are added and the result is stored in general-purpose register rd. no integer overflow exception occurs under any circumstances. in 64-bit mode, the operands must be valid sign-extended, 32-bit values. the only difference between this instruction and the add instruction is that addu never causes an integer overflow exception.

operation:
32  t:  gpr[rd] ← gpr[rs] + gpr[rt]
64  t:  temp ← gpr[rs] + gpr[rt]
        gpr[rd] ← (temp_31)^32 || temp_31..0

exceptions: none
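the bit encoding shown for the special-class instructions above can be assembled with a few shifts. the sketch below packs the fields in the order given by the encoding (opcode, rs, rt, rd, a zero field, function code); the helper name encode_special and the constants are hypothetical names chosen for illustration.

```python
FUNCT_ADD = 0b100000    # function code of add
FUNCT_ADDU = 0b100001   # function code of addu
FUNCT_AND = 0b100100    # function code of and

def encode_special(rs: int, rt: int, rd: int, funct: int) -> int:
    """Build a 32-bit instruction word with opcode special (000000)."""
    for name, reg in (("rs", rs), ("rt", rt), ("rd", rd)):
        if not 0 <= reg <= 31:
            raise ValueError(name + " must be a register number from 0 to 31")
    if not 0 <= funct <= 0b111111:
        raise ValueError("funct is a 6-bit field")
    # bits 31-26 (special) and bits 10-6 are zero, so they contribute nothing
    return (rs << 21) | (rt << 16) | (rd << 11) | funct
```

for example, encode_special(1, 2, 3, FUNCT_ADDU) gives 0x00221821, the word for addu with rd = 3, rs = 1, and rt = 2.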
and    and

bits     31-26            25-21   20-16   15-11   10-6      5-0
field    special 000000   rs      rt      rd      0 00000   and 100100

format: and rd, rs, rt    (mips i)

purpose: performs a bit-wise logical and operation.

description: the contents of general-purpose register rs are combined with the contents of general-purpose register rt in a bit-wise logical and operation. the result is stored in general-purpose register rd.

operation:
32  t:  gpr[rd] ← gpr[rs] and gpr[rt]
64  t:  gpr[rd] ← gpr[rs] and gpr[rt]

exceptions: none
andi    and immediate

bits     31-26         25-21   20-16   15-0
field    andi 001100   rs      rt      immediate

format: andi rt, rs, immediate    (mips i)

purpose: performs a bit-wise logical and operation with a constant.

description: the 16-bit immediate is zero-extended and combined with the contents of general-purpose register rs in a bit-wise logical and operation. the result is stored in general-purpose register rt.

operation:
32  t:  gpr[rt] ← 0^16 || (immediate and gpr[rs]_15..0)
64  t:  gpr[rt] ← 0^48 || (immediate and gpr[rs]_15..0)

exceptions: none
bc0f    branch on coprocessor 0 false

bits     31-26         25-21      20-16       15-0
field    cop0 010000   bc 01000   bcf 00000   offset

format: bc0f offset    (mips i)

purpose: tests the cp0 condition code and executes a pc relative condition branch.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the contents of cp0's condition signal (cpcond), as sampled during the previous instruction, is false, then the program branches to the target address with a delay of one instruction. because the condition line is sampled during the previous instruction, there must be at least one instruction between this instruction and a coprocessor instruction that changes the condition line.

remark: the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.

operation:
32  t - 1:  condition ← not cop0
    t:      target ← (offset_15)^14 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            endif
64  t - 1:  condition ← not cop0
    t:      target ← (offset_15)^46 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            endif

exceptions: coprocessor unusable exception
bc0fl    branch on coprocessor 0 false likely

bits     31-26         25-21      20-16        15-0
field    cop0 010000   bc 01000   bcfl 00010   offset

format: bc0fl offset    (mips ii)

purpose: tests the cp0 condition code and executes a pc relative condition branch. executes a delay slot only when a given branch condition is satisfied.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the contents of cp0's condition (cpcond) line, as sampled during the previous instruction, is false, the target address is branched to with a delay of one instruction. if the conditional branch is not taken, the instruction in the branch delay slot is discarded.

remarks
1. the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.
2. use this instruction only when it is expected with a high probability (98% or higher) that a given branch condition is satisfied. if the branch condition is not satisfied or if the branch destination is not known, use the bc0f instruction.

operation:
32  t - 1:  condition ← not cop0
    t:      target ← (offset_15)^14 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            else
                nullifycurrentinstruction
            endif
64  t - 1:  condition ← not cop0
    t:      target ← (offset_15)^46 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            else
                nullifycurrentinstruction
            endif

exceptions: coprocessor unusable exception
bc0t    branch on coprocessor 0 true

bits     31-26         25-21      20-16       15-0
field    cop0 010000   bc 01000   bct 00001   offset

format: bc0t offset    (mips i)

purpose: tests the cp0 condition code and executes a pc relative condition branch.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the contents of cp0's condition signal (cpcond), as sampled during the previous instruction, is true, then the program branches to the target address, with a delay of one instruction.

remark: the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.

operation:
32  t - 1:  condition ← cop0
    t:      target ← (offset_15)^14 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            endif
64  t - 1:  condition ← cop0
    t:      target ← (offset_15)^46 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            endif

exceptions: coprocessor unusable exception
bc0tl    branch on coprocessor 0 true likely

bits     31-26         25-21      20-16        15-0
field    cop0 010000   bc 01000   bctl 00011   offset

format: bc0tl offset    (mips ii)

purpose: tests the cp0 condition code and executes a pc relative condition branch. executes a delay slot only when a given branch condition is satisfied.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the contents of cp0's condition (cpcond) line, as sampled during the previous instruction, is true, the target address is branched to with a delay of one instruction. if the conditional branch is not taken, the instruction in the branch delay slot is discarded.

remarks
1. the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.
2. use this instruction only when it is expected with a high probability (98% or higher) that a given branch condition is satisfied. if the branch condition is not satisfied or if the branch destination is not known, use the bc0t instruction.

operation:
32  t - 1:  condition ← cop0
    t:      target ← (offset_15)^14 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            else
                nullifycurrentinstruction
            endif
64  t - 1:  condition ← cop0
    t:      target ← (offset_15)^46 || offset || 0^2
    t + 1:  if condition then
                pc ← pc + target
            else
                nullifycurrentinstruction
            endif

exceptions: coprocessor unusable exception
beq    branch on equal

bits     31-26        25-21   20-16   15-0
field    beq 000100   rs      rt      offset

format: beq rs, rt, offset    (mips i)

purpose: compares general-purpose registers and executes a pc relative condition branch.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. the contents of general-purpose register rs and the contents of general-purpose register rt are compared. if the two registers are equal, then the program branches to the target address, with a delay of one instruction.

remark: the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.

operation:
32  t:      target ← (offset_15)^14 || offset || 0^2
            condition ← (gpr[rs] = gpr[rt])
    t + 1:  if condition then
                pc ← pc + target
            endif
64  t:      target ← (offset_15)^46 || offset || 0^2
            condition ← (gpr[rs] = gpr[rt])
    t + 1:  if condition then
                pc ← pc + target
            endif

exceptions: none
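the branch target calculation shared by the conditional branches can be sketched for 32-bit mode as follows. the sketch assumes that the delay slot instruction sits at the branch address plus 4, as the description implies; the function name branch_target is hypothetical.

```python
def branch_target(branch_pc: int, offset: int) -> int:
    """Return the 32-bit target address of a pc relative branch.

    branch_pc: address of the branch instruction itself
    offset:    16-bit offset field taken from the instruction word
    """
    off = offset & 0xFFFF
    if off & 0x8000:
        off -= 0x10000                 # sign-extend the 16-bit offset
    delay_slot_pc = branch_pc + 4      # address of the instruction in the delay slot
    # shifting left by two bits gives the 18-bit signed byte displacement
    return (delay_slot_pc + (off << 2)) & 0xFFFFFFFF
```

the displacement therefore spans from -131072 to +131068 bytes around the delay slot, which is the 128 kb range mentioned in the remark.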
beql    branch on equal likely

bits     31-26         25-21   20-16   15-0
field    beql 010100   rs      rt      offset

format: beql rs, rt, offset    (mips ii)

purpose: compares general-purpose registers and executes a pc relative condition branch. executes a delay slot only when a given branch condition is satisfied.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. the contents of general-purpose register rs and the contents of general-purpose register rt are compared. if the two registers are equal, the target address is branched to, with a delay of one instruction. if the conditional branch is not taken, the instruction in the branch delay slot is discarded.

remarks
1. the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.
2. use this instruction only when it is expected with a high probability (98% or higher) that a given branch condition is satisfied. if the branch condition is not satisfied or if the branch destination is not known, use the beq instruction.

operation:
32  t:      target ← (offset_15)^14 || offset || 0^2
            condition ← (gpr[rs] = gpr[rt])
    t + 1:  if condition then
                pc ← pc + target
            else
                nullifycurrentinstruction
            endif
64  t:      target ← (offset_15)^46 || offset || 0^2
            condition ← (gpr[rs] = gpr[rt])
    t + 1:  if condition then
                pc ← pc + target
            else
                nullifycurrentinstruction
            endif

exceptions: none
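the difference between a branch likely instruction and its ordinary counterpart lies in the treatment of the delay slot. the sketch below models beql in 32-bit mode and reports whether the delay slot instruction is executed; it assumes the delay slot is at the branch address plus 4 and that execution resumes at the branch address plus 8 when the branch is not taken. the function name beql_step is hypothetical.

```python
def beql_step(branch_pc: int, rs_value: int, rt_value: int, offset: int):
    """Return (delay_slot_executed, next_pc) for beql in 32-bit mode."""
    off = offset & 0xFFFF
    if off & 0x8000:
        off -= 0x10000                  # sign-extend the 16-bit offset
    if rs_value == rt_value:
        # branch taken: the delay slot runs, then control moves to the target
        target = (branch_pc + 4 + (off << 2)) & 0xFFFFFFFF
        return True, target
    # branch not taken: the delay slot instruction is nullified (discarded)
    return False, (branch_pc + 8) & 0xFFFFFFFF
```

for the ordinary beq instruction, the first element of the result would be true in both cases, because the delay slot is always executed.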
bgez    branch on greater than or equal to zero

bits     31-26           25-21   20-16        15-0
field    regimm 000001   rs      bgez 00001   offset

format: bgez rs, offset    (mips i)

purpose: tests a general-purpose register and executes a pc relative condition branch.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the contents of general-purpose register rs are zero or greater when compared to zero, then the program branches to the target address, with a delay of one instruction.

remark: the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.

operation:
32  t:      target ← (offset_15)^14 || offset || 0^2
            condition ← (gpr[rs]_31 = 0)
    t + 1:  if condition then
                pc ← pc + target
            endif
64  t:      target ← (offset_15)^46 || offset || 0^2
            condition ← (gpr[rs]_63 = 0)
    t + 1:  if condition then
                pc ← pc + target
            endif

exceptions: none
bgezal    branch on greater than or equal to zero and link

bits     31-26           25-21   20-16          15-0
field    regimm 000001   rs      bgezal 10001   offset

format: bgezal rs, offset    (mips i)

purpose: tests a general-purpose register and executes a pc relative condition procedure call.

description: a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. unconditionally, the address of the instruction after the delay slot is stored in the link register, r31. if the contents of general-purpose register rs are zero or greater when compared to zero, then the program branches to the target address, with a delay of one instruction.

general-purpose register r31 should not be specified as general-purpose register rs. if register r31 is specified, restarting may be impossible because the contents of rs are destroyed when the link address is stored. even if such an instruction is executed, an exception does not result.

remark: the condition branch range of this instruction is 128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.

operation:
32  t:      target ← (offset_15)^14 || offset || 0^2
            condition ← (gpr[rs]_31 = 0)
            gpr[31] ← pc + 8
    t + 1:  if condition then
                pc ← pc + target
            endif
64  t:      target ← (offset_15)^46 || offset || 0^2
            condition ← (gpr[rs]_63 = 0)
            gpr[31] ← pc + 8
    t + 1:  if condition then
                pc ← pc + target
            endif

exceptions: none
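the bgezal operation combines a sign-bit test with an unconditional link. a minimal 32-bit mode sketch follows; it assumes pc in the operation section denotes the address of the branch instruction at time t, so the link value is the address after the delay slot. the function name bgezal_step is hypothetical.

```python
def bgezal_step(branch_pc: int, rs_value: int, offset: int):
    """Return (link, taken, next_pc) for bgezal in 32-bit mode."""
    off = offset & 0xFFFF
    if off & 0x8000:
        off -= 0x10000                         # sign-extend the 16-bit offset
    link = (branch_pc + 8) & 0xFFFFFFFF        # gpr[31] is written unconditionally
    taken = ((rs_value >> 31) & 1) == 0        # condition: bit 31 of rs is 0
    if taken:
        next_pc = (branch_pc + 4 + (off << 2)) & 0xFFFFFFFF
    else:
        next_pc = link                         # fall through past the delay slot
    return link, taken, next_pc
```

because the link value is written whether or not the branch is taken, using r31 as rs would overwrite the tested value, which is why the description warns against that combination.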
BGEZALL  Branch on Greater Than or Equal to Zero and Link Likely

Encoding: 31..26 REGIMM (000001) | 25..21 rs | 20..16 BGEZALL (10011) | 15..0 offset
Format: BGEZALL rs, offset    MIPS II
Purpose: Tests a general-purpose register and executes a PC-relative conditional procedure call. Executes the delay slot instruction only when the branch condition is satisfied.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay slot is stored in the link register, r31. If the contents of general-purpose register rs are zero or greater, the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is discarded.
General-purpose register r31 should not be specified as general-purpose register rs. If r31 is specified, restarting may be impossible because storing the link address destroys the contents of rs. Even if such an instruction is executed, no exception results.
Remarks:
1. The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
2. Use this instruction only when the branch condition is expected to be satisfied with a high probability (98% or higher). If that cannot be expected, or if the branch destination is not known, use the BGEZAL instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0)
           GPR[31] ← PC + 8
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0)
           GPR[31] ← PC + 8
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
Exceptions: None
BGEZL  Branch on Greater Than or Equal to Zero Likely

Encoding: 31..26 REGIMM (000001) | 25..21 rs | 20..16 BGEZL (00011) | 15..0 offset
Format: BGEZL rs, offset    MIPS II
Purpose: Tests a general-purpose register and executes a PC-relative conditional branch. Executes the delay slot instruction only when the branch condition is satisfied.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of general-purpose register rs are zero or greater, the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is discarded.
Remarks:
1. The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
2. Use this instruction only when the branch condition is expected to be satisfied with a high probability (98% or higher). If that cannot be expected, or if the branch destination is not known, use the BGEZ instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
Exceptions: None
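The difference between a "likely" branch and its ordinary counterpart is whether the delay slot executes on the not-taken path. The sketch below models this for the 32-bit case; the function name `bgezl` and the returned `(next_pc, run_delay_slot)` pair are illustrative assumptions, not something defined by the manual.

```python
def bgezl(rs_value, pc, offset16):
    """Model BGEZL in 32-bit mode.

    Returns (next_pc, run_delay_slot): the address executed after the
    delay slot, and whether the delay-slot instruction is executed.
    """
    taken = ((rs_value >> 31) & 1) == 0
    disp = (((offset16 & 0xFFFF) ^ 0x8000) - 0x8000) << 2
    if taken:
        return (pc + 4 + disp) & 0xFFFFFFFF, True
    # Not taken: the delay-slot instruction is nullified (discarded).
    return (pc + 8) & 0xFFFFFFFF, False
```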
BGTZ  Branch on Greater Than Zero

Encoding: 31..26 BGTZ (000111) | 25..21 rs | 20..16 0 (00000) | 15..0 offset
Format: BGTZ rs, offset    MIPS I
Purpose: Tests a general-purpose register and executes a PC-relative conditional branch.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general-purpose register rs are compared to zero. If the contents of general-purpose register rs are greater than zero, the program branches to the target address, with a delay of one instruction.
Remark: The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
    T + 1: if condition then
             PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
    T + 1: if condition then
             PC ← PC + target
           endif
Exceptions: None
BGTZL  Branch on Greater Than Zero Likely

Encoding: 31..26 BGTZL (010111) | 25..21 rs | 20..16 0 (00000) | 15..0 offset
Format: BGTZL rs, offset    MIPS II
Purpose: Tests a general-purpose register and executes a PC-relative conditional branch. Executes the delay slot instruction only when the branch condition is satisfied.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general-purpose register rs are compared to zero. If the contents of general-purpose register rs are greater than zero, the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is discarded.
Remarks:
1. The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
2. Use this instruction only when the branch condition is expected to be satisfied with a high probability (98% or higher). If that cannot be expected, or if the branch destination is not known, use the BGTZ instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 0) and (GPR[rs] ≠ 0^32)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 0) and (GPR[rs] ≠ 0^64)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
Exceptions: None
BLEZ  Branch on Less Than or Equal to Zero

Encoding: 31..26 BLEZ (000110) | 25..21 rs | 20..16 0 (00000) | 15..0 offset
Format: BLEZ rs, offset    MIPS I
Purpose: Tests a general-purpose register and executes a PC-relative conditional branch.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general-purpose register rs are compared to zero. If the contents of general-purpose register rs are zero or less, the program branches to the target address, with a delay of one instruction.
Remark: The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
    T + 1: if condition then
             PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
    T + 1: if condition then
             PC ← PC + target
           endif
Exceptions: None
BLEZL  Branch on Less Than or Equal to Zero Likely

Encoding: 31..26 BLEZL (010110) | 25..21 rs | 20..16 0 (00000) | 15..0 offset
Format: BLEZL rs, offset    MIPS II
Purpose: Tests a general-purpose register and executes a PC-relative conditional branch. Executes the delay slot instruction only when the branch condition is satisfied.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general-purpose register rs are compared to zero. If the contents of general-purpose register rs are zero or less, the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is discarded.
Remarks:
1. The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
2. Use this instruction only when the branch condition is expected to be satisfied with a high probability (98% or higher). If that cannot be expected, or if the branch destination is not known, use the BLEZ instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1) or (GPR[rs] = 0^32)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1) or (GPR[rs] = 0^64)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
Exceptions: None
BLTZ  Branch on Less Than Zero

Encoding: 31..26 REGIMM (000001) | 25..21 rs | 20..16 BLTZ (00000) | 15..0 offset
Format: BLTZ rs, offset    MIPS I
Purpose: Tests a general-purpose register and executes a PC-relative conditional branch.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of general-purpose register rs are less than zero, the program branches to the target address, with a delay of one instruction.
Remark: The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
    T + 1: if condition then
             PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
    T + 1: if condition then
             PC ← PC + target
           endif
Exceptions: None
BLTZAL  Branch on Less Than Zero and Link

Encoding: 31..26 REGIMM (000001) | 25..21 rs | 20..16 BLTZAL (10000) | 15..0 offset
Format: BLTZAL rs, offset    MIPS I
Purpose: Tests a general-purpose register and executes a PC-relative conditional procedure call.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay slot is stored in the link register, r31. If the contents of general-purpose register rs are less than zero, the program branches to the target address, with a delay of one instruction.
General-purpose register r31 should not be specified as general-purpose register rs. If r31 is specified, restarting may be impossible because storing the link address destroys the contents of rs. Even if such an instruction is executed, no exception results.
Remark: The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
             PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
             PC ← PC + target
           endif
Exceptions: None
BLTZALL  Branch on Less Than Zero and Link Likely

Encoding: 31..26 REGIMM (000001) | 25..21 rs | 20..16 BLTZALL (10010) | 15..0 offset
Format: BLTZALL rs, offset    MIPS II
Purpose: Tests a general-purpose register and executes a PC-relative conditional procedure call. Executes the delay slot instruction only when the branch condition is satisfied.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. Unconditionally, the address of the instruction after the delay slot is stored in the link register, r31. If the contents of general-purpose register rs are less than zero, the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is discarded.
General-purpose register r31 should not be specified as general-purpose register rs. If r31 is specified, restarting may be impossible because storing the link address destroys the contents of rs. Even if such an instruction is executed, no exception results.
Remarks:
1. The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
2. Use this instruction only when the branch condition is expected to be satisfied with a high probability (98% or higher). If that cannot be expected, or if the branch destination is not known, use the BLTZAL instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
           GPR[31] ← PC + 8
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
Exceptions: None
BLTZL  Branch on Less Than Zero Likely

Encoding: 31..26 REGIMM (000001) | 25..21 rs | 20..16 BLTZL (00010) | 15..0 offset
Format: BLTZL rs, offset    MIPS II
Purpose: Tests a general-purpose register and executes a PC-relative conditional branch. Executes the delay slot instruction only when the branch condition is satisfied.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. If the contents of general-purpose register rs are less than zero, the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is discarded.
Remarks:
1. The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
2. Use this instruction only when the branch condition is expected to be satisfied with a high probability (98% or higher). If that cannot be expected, or if the branch destination is not known, use the BLTZ instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs]_31 = 1)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs]_63 = 1)
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
Exceptions: None
BNE  Branch on Not Equal

Encoding: 31..26 BNE (000101) | 25..21 rs | 20..16 rt | 15..0 offset
Format: BNE rs, rt, offset    MIPS I
Purpose: Compares two general-purpose registers and executes a PC-relative conditional branch.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general-purpose register rs and the contents of general-purpose register rt are compared. If the two registers are not equal, the program branches to the target address, with a delay of one instruction.
Remark: The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
             PC ← PC + target
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
             PC ← PC + target
           endif
Exceptions: None
BNEL  Branch on Not Equal Likely

Encoding: 31..26 BNEL (010101) | 25..21 rs | 20..16 rt | 15..0 offset
Format: BNEL rs, rt, offset    MIPS II
Purpose: Compares two general-purpose registers and executes a PC-relative conditional branch. Executes the delay slot instruction only when the branch condition is satisfied.
Description: A branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. The contents of general-purpose register rs and the contents of general-purpose register rt are compared. If the two registers are not equal, the program branches to the target address, with a delay of one instruction. If the conditional branch is not taken, the instruction in the branch delay slot is discarded.
Remarks:
1. The conditional branch range of this instruction is ±128 KB because an 18-bit signed offset is used. To branch to an address outside this range, use the J or JR instruction.
2. Use this instruction only when the branch condition is expected to be satisfied with a high probability (98% or higher). If that cannot be expected, or if the branch destination is not known, use the BNE instruction.
Operation:
32  T:     target ← (offset_15)^14 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
64  T:     target ← (offset_15)^46 || offset || 0^2
           condition ← (GPR[rs] ≠ GPR[rt])
    T + 1: if condition then
             PC ← PC + target
           else
             NullifyCurrentInstruction
           endif
Exceptions: None
BREAK  Breakpoint

Encoding: 31..26 SPECIAL (000000) | 25..6 code | 5..0 BREAK (001101)
Format: BREAK    MIPS I
Purpose: Generates a breakpoint exception.
Description: A breakpoint exception occurs, immediately and unconditionally transferring control to the exception handler. The code field is available for use as software parameters, but the exception handler can retrieve it only by loading the contents of the memory word containing the instruction.
Operation:
32, 64  T: BreakpointException
Exceptions: Breakpoint exception
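Once a handler has loaded the instruction word from memory, extracting the code field is a shift and mask. The sketch below assumes the field layout given above (code in bits 25..6); the function name `break_code` is hypothetical, and how the handler locates the instruction word is outside its scope.

```python
def break_code(instr):
    """Return the 20-bit code field of a BREAK instruction word."""
    opcode = (instr >> 26) & 0x3F     # bits 31..26, SPECIAL = 000000
    funct = instr & 0x3F              # bits 5..0, BREAK = 001101
    if opcode != 0 or funct != 0b001101:
        raise ValueError("not a BREAK instruction")
    return (instr >> 6) & 0xFFFFF     # bits 25..6
```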
CACHE  Cache Operation

Encoding: 31..26 CACHE (101111) | 25..21 base | 20..16 op | 15..0 offset
Format: CACHE op, offset (base)    MIPS III
Description: The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The virtual address is translated to a physical address using the TLB, and the 5-bit sub-opcode specifies a cache operation for that address.
If CP0 is not usable (user or supervisor mode) and the CP0 enable bit in the Status register is clear, a coprocessor unusable exception is taken. The operation of this instruction on any operation/cache combination not listed below, or on a secondary cache that is not incorporated in the VR5500, is undefined. The operation of this instruction on uncached addresses is also undefined.
The index operation uses part of the virtual address to specify a cache block. For a cache of 2^CACHEBITS bytes with 2^LINEBITS bytes per tag, vAddr_CACHEBITS..LINEBITS specifies the block. The way of the cache is specified by using bit 0 of the virtual address. In Hit, Fill, and Fetch_and_Lock operations, the way of the cache is specified by using the LRU bit of the cache tag.
Index_Load_Tag also uses vAddr_LINEBITS..3 to select the doubleword for reading parity. If the CE bit of the Status register is set, vAddr_LINEBITS..3 is used for Hit_Write_Back_Invalidate, Index_Write_Back_Invalidate, and Fill operations to select the doubleword that includes the modified parity. This operation is unconditionally executed.
The hit operation accesses the specified cache as normal data references, and performs the specified operation if the cache block contains valid data with the specified physical address (a hit). If the cache block is invalid or contains a different address (a miss), no operation is performed.
Write back from a cache goes to main memory. The main memory address to be written is specified by the cache tag and not the physical address translated using the TLB.
TLB refill and TLB invalid exceptions can occur on any operation. For index operations (Note) for addresses in the unmapped areas, unmapped addresses may be used to avoid TLB exceptions. Index operations never cause a TLB modified exception.
Note: Physical addresses here are used to index the cache, and they do not need to match the cache tag.
Bits 17 and 16 of the instruction code specify the cache for which the operation is to be performed, as follows.

op_1..0  Name  Cache
0        I     Instruction cache
1        D     Data cache
2        -     Reserved
3        -     Reserved

Bits 20 to 18 of this instruction specify the contents of the cache operation. Details are provided from the next page.
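Splitting the 5-bit op field into its two parts can be sketched as below. The bit positions come from the text above (instruction bits 17..16 select the cache, bits 20..18 select the operation); the function name `decode_cache_op` and the returned tuple are illustrative assumptions.

```python
CACHE_NAMES = {0: "I", 1: "D"}   # values 2 and 3 are reserved


def decode_cache_op(instr):
    """Return (cache_name, operation_code) for a CACHE instruction word."""
    if (instr >> 26) & 0x3F != 0b101111:
        raise ValueError("not a CACHE instruction")
    op = (instr >> 16) & 0x1F          # instruction bits 20..16
    cache = op & 0x3                   # op_1..0 = instruction bits 17..16
    operation = op >> 2                # op_4..2 = instruction bits 20..18
    return CACHE_NAMES.get(cache, "reserved"), operation
```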
op_4..2  Cache  Name and operation

0  I     Index_Invalidate
         Set the cache state of the cache block to invalid.
0  D     Index_Write_Back_Invalidate
         Examine the cache state of the data cache block at the index specified by the virtual address. If the state is dirty and not invalid, write back the block to memory. The address to write is taken from the cache tag. Set the cache state of the cache block to invalid.
1  I, D  Index_Load_Tag
         Read the tag for the cache block at the specified index and place it into the TagLo CP0 register. At this time, a parity error is ignored. In addition, data is loaded from the doubleword for which the data parity was specified to the Parity Error register.
2  I, D  Index_Store_Tag
         Write the tag for the cache block at the specified index from the TagLo CP0 register.
3  D     Create_Dirty
         This operation is used to avoid loading data needlessly from memory when writing new contents to an entire cache block. If the cache block does not contain the specified address and the block is dirty, write it back to memory. In all cases, set the cache block tag to the specified physical address and set the cache state to dirty.
4  I, D  Hit_Invalidate
         If the cache block contains the specified address, mark the cache block invalid.
5  I     Fill
         Fill the instruction cache block from memory. If the CE bit of the Status register is set, the contents of the ECC register are used instead of the computed parity bits for the addressed doubleword when it is written to the instruction cache.
5  D     Hit_Write_Back_Invalidate
         If the cache block contains the specified address, write back the data if it is dirty, and mark the cache block invalid.
6  D     Hit_Write_Back
         If the cache block contains the specified address and the cache state is dirty, write the data back to main memory and set the cache state of that cache block to clean.
7  I     Fetch_and_Lock
         If the specified address is not included in the cache block, fill that block with data from main memory. In all cases, set the cache block tag to the specified physical address and set the cache state to locked.
7  D     Fetch_and_Lock
         If the specified address is not included in the cache block and that block is dirty, write the data back; then fill the block with data from main memory. In all cases, set the cache block tag to the specified physical address and set the cache state to locked.
Operation:
32, 64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
           (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
           CacheOp (op, vAddr, pAddr)
Exceptions:
Coprocessor unusable exception
TLB refill exception
TLB invalid exception
Bus error exception
Address error exception
Cache error exception
CLO  Count Leading Ones in Word

Encoding: 31..26 SPECIAL2 (011100) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 0 (00000) | 5..0 CLO (100001)
Format: CLO rd, rs    VR5500
Purpose: Counts the number of leading 1s in 32-bit data.
Description: This instruction scans the 32-bit contents of general-purpose register rs from the most significant bit toward the least significant bit, and stores the number of 1s found before the first 0 in general-purpose register rd. If the value of register rs is all 1s, 32 is stored in rd.
In the 64-bit mode, the operand must be a sign-extended 32-bit value; otherwise the result is undefined.
Specify the same register for general-purpose register rt as for general-purpose register rd.
Operation:
32, 64  T: temp ← 32
           for i in 31..0
             if GPR[rs]_i = 0 then
               temp ← 31 − i
               break
             endif
           endfor
           GPR[rd] ← (temp_31)^32 || temp
Exceptions: None
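The operation pseudocode translates almost line for line into the sketch below. The function name `clo` is an illustrative choice, and the argument is assumed to be the low 32 bits of rs.

```python
def clo(rs_value):
    """Count leading ones in a 32-bit word, mirroring the pseudocode."""
    temp = 32                              # result if every bit is 1
    for i in range(31, -1, -1):            # scan from bit 31 down to bit 0
        if (rs_value >> i) & 1 == 0:
            temp = 31 - i                  # number of 1s seen before this 0
            break
    return temp
```

CLZ follows the same pattern with the bit test inverted.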
CLZ  Count Leading Zeros in Word

Encoding: 31..26 SPECIAL2 (011100) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 0 (00000) | 5..0 CLZ (100000)
Format: CLZ rd, rs    VR5500
Purpose: Counts the number of leading 0s in 32-bit data.
Description: This instruction scans the 32-bit contents of general-purpose register rs from the most significant bit toward the least significant bit, and stores the number of 0s found before the first 1 in general-purpose register rd. If the value of register rs is all 0s, 32 is stored in rd.
In the 64-bit mode, the operand must be a sign-extended 32-bit value; otherwise the result is undefined.
Specify the same register for general-purpose register rt as for general-purpose register rd.
Operation:
32, 64  T: temp ← 32
           for i in 31..0
             if GPR[rs]_i = 1 then
               temp ← 31 − i
               break
             endif
           endfor
           GPR[rd] ← (temp_31)^32 || temp
Exceptions: None
COPz  Coprocessor z Operation

Encoding: 31..26 COPz (0100xx, see Note) | 25 CO (1) | 24..0 cofun
Format: COPz cofun    MIPS I
Purpose: Executes a coprocessor instruction.
Description: This instruction executes a coprocessor instruction. It can specify and reference an internal coprocessor register and can modify the status of the coprocessor. However, the status of the processor, cache, and main memory remains unchanged. For details of the coprocessor instructions, refer to CHAPTER 18 FPU INSTRUCTION SET.
Operation:
32, 64  T: CoprocessorOperation (z, cofun)
Exceptions:
Coprocessor unusable exception
Floating-point operation exception (CP1 only)
Note: See the opcode table below, or 17.4 CPU Instruction Opcode Bit Encoding.
Opcode table (bits 31..28: opcode, bits 27..26: coprocessor no., bit 25: coprocessor sub-opcode):

Bit   31 30 29 28 27 26 25
COP0   0  1  0  0  0  0  1
COP1   0  1  0  0  0  1  1
COP2   0  1  0  0  1  0  1

Remark: Coprocessor 2 is reserved in the VR5500.
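As a rough illustration of the opcode table, the coprocessor number and cofun field can be separated as follows. The function name `decode_copz` is hypothetical, and rejecting the value 3 in bits 27..26 is an assumption made here because the table lists only COP0 to COP2.

```python
def decode_copz(instr):
    """Return (z, cofun) for a COPz instruction word."""
    opcode = (instr >> 26) & 0x3F
    co = (instr >> 25) & 1
    z = opcode & 0x3                       # bits 27..26: coprocessor number
    if (opcode >> 2) != 0b0100 or co != 1 or z == 3:
        raise ValueError("not a COPz instruction listed in the opcode table")
    return z, instr & 0x1FFFFFF            # bits 24..0: cofun
```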
DADD  Doubleword Add

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 0 (00000) | 5..0 DADD (101100)
Format: DADD rd, rs, rt    MIPS III
Purpose: Adds 64-bit integers. A trap is performed if an overflow occurs.
Description: The contents of general-purpose register rs and the contents of general-purpose register rt are added and the result is stored in general-purpose register rd. An integer overflow exception occurs if the carries out of bits 62 and 63 differ (2's complement overflow). The destination register rd is not modified when an integer overflow exception occurs.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
64  T: GPR[rd] ← GPR[rs] + GPR[rt]
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions:
Integer overflow exception
Reserved instruction exception (32-bit user/supervisor mode)
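The overflow rule (carries out of bits 62 and 63 differ) can be checked directly, as in this sketch. The exception class `IntegerOverflow` and the function name `dadd` are illustrative stand-ins for the hardware exception, not names from the manual.

```python
MASK64 = (1 << 64) - 1
LOW63 = (1 << 63) - 1


class IntegerOverflow(Exception):
    """Stand-in for the integer overflow exception."""


def dadd(a, b):
    """Add two 64-bit register values, raising on two's-complement overflow."""
    a &= MASK64
    b &= MASK64
    carry_out_62 = ((a & LOW63) + (b & LOW63)) >> 63   # carry into bit 63
    carry_out_63 = (a + b) >> 64                       # carry out of bit 63
    if carry_out_62 != carry_out_63:
        raise IntegerOverflow("carries out of bits 62 and 63 differ")
    return (a + b) & MASK64
```

Raising before returning mirrors the statement that rd is left unmodified when the exception occurs.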
DADDI  Doubleword Add Immediate

Encoding: 31..26 DADDI (011000) | 25..21 rs | 20..16 rt | 15..0 immediate
Format: DADDI rt, rs, immediate    MIPS III
Purpose: Adds a constant to a 64-bit integer. A trap is performed if an overflow occurs.
Description: The 16-bit immediate is sign-extended and added to the contents of general-purpose register rs, and the result is stored in general-purpose register rt. An integer overflow exception occurs if the carries out of bits 62 and 63 differ (2's complement overflow). The destination register rt is not modified when an integer overflow exception occurs.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
64  T: GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_15..0
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions:
Integer overflow exception
Reserved instruction exception (32-bit user/supervisor mode)
DADDIU  Doubleword Add Immediate Unsigned

Encoding: 31..26 DADDIU (011001) | 25..21 rs | 20..16 rt | 15..0 immediate
Format: DADDIU rt, rs, immediate    MIPS III
Purpose: Adds a constant to a 64-bit integer.
Description: The 16-bit immediate is sign-extended and added to the contents of general-purpose register rs, and the result is stored in general-purpose register rt.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
The only difference between this instruction and the DADDI instruction is that DADDIU never causes an integer overflow exception.
Operation:
64  T: GPR[rt] ← GPR[rs] + (immediate_15)^48 || immediate_15..0
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions:
Reserved instruction exception (32-bit user/supervisor mode)
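Despite the "unsigned" in its name, the immediate is still sign-extended; only the overflow check is absent. A small sketch, with `daddiu` as a hypothetical function name:

```python
MASK64 = (1 << 64) - 1


def daddiu(rs_value, immediate):
    """64-bit add of a sign-extended 16-bit immediate, wrapping on overflow."""
    imm = ((immediate & 0xFFFF) ^ 0x8000) - 0x8000   # sign-extend 16 bits
    return (rs_value + imm) & MASK64                 # no overflow exception
```

An immediate of 0xFFFF therefore subtracts 1 rather than adding 65535.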
DADDU  Doubleword Add Unsigned

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 0 (00000) | 5..0 DADDU (101101)
Format: DADDU rd, rs, rt    MIPS III
Purpose: Adds 64-bit integers.
Description: The contents of general-purpose register rs and the contents of general-purpose register rt are added and the result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
The only difference between this instruction and the DADD instruction is that DADDU never causes an integer overflow exception.
Operation:
64  T: GPR[rd] ← GPR[rs] + GPR[rt]
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions:
Reserved instruction exception (32-bit user/supervisor mode)
DCLO  Count Leading Ones in Doubleword

Encoding: 31..26 SPECIAL2 (011100) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 0 (00000) | 5..0 DCLO (100101)
Format: DCLO rd, rs    VR5500
Purpose: Counts the number of leading 1s in 64-bit data.
Description: This instruction scans the 64-bit contents of general-purpose register rs from the most significant bit toward the least significant bit, and stores the number of 1s found before the first 0 in general-purpose register rd. If the value of register rs is all 1s, 64 is stored in rd.
Specify the same register for general-purpose register rt as for general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
64  T: temp ← 64
       for i in 63..0
         if GPR[rs]_i = 0 then
           temp ← 63 − i
           break
         endif
       endfor
       GPR[rd] ← (temp_31)^32 || temp
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions:
Reserved instruction exception (32-bit user/supervisor mode)
DCLZ (Count Leading Zeros in Doubleword)

Encoding: 31..26 SPECIAL2 (011100) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 00000 | 5..0 DCLZ (100100)
Format: DCLZ rd, rs (VR5500)
Purpose: Counts the number of leading 0s in 64-bit data.
Description: This instruction scans the 64-bit contents of general-purpose register rs from the most significant bit toward the least significant bit, and stores the number of consecutive 0s found before the first 1 in general-purpose register rd. If all bits of register rs are 0, 64 is stored in rd.
Specify the same register for general-purpose register rt as for general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: temp ← 64
         for i in 63..0
           if GPR[rs]_i = 1 then
             temp ← 63 - i
             break
           endif
         endfor
         GPR[rd] ← (temp_31)^32 || temp
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

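The scan loops of DCLO and DCLZ can be modeled as follows. This is a sketch under the assumption that register values are unsigned 64-bit bit patterns; `dclz` and `dclo` are illustrative Python functions, and the use of `int.bit_length` is a shortcut that replaces the bit-by-bit loop of the operation description.

```python
MASK64 = (1 << 64) - 1


def dclz(rs_value: int) -> int:
    """Count the 0 bits above the most significant 1 bit (64 if the value is 0)."""
    rs_value &= MASK64
    # bit_length() is the position of the highest set bit plus one.
    return 64 - rs_value.bit_length()


def dclo(rs_value: int) -> int:
    """Count the 1 bits above the most significant 0 bit (64 if all bits are 1)."""
    # Leading ones of x are the leading zeros of the inverted value.
    return dclz(~rs_value & MASK64)
```

For example, a value whose top four bits are 1 followed by a 0 gives a count of 4 from `dclo`, regardless of any 1s in the lower bits.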
DDIV (Doubleword Divide)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..6 0000000000 | 5..0 DDIV (011110)
Format: DDIV rs, rt (MIPS III)
Purpose: Divides 64-bit signed integers.
Description: The contents of general-purpose register rs are divided by the contents of general-purpose register rt, treating both operands as signed values. No integer overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero.
This instruction is typically followed by additional instructions to check for a zero divisor and for overflow.
When the operation completes, the quotient word of the double result is loaded to special register LO, and the remainder word of the double result is loaded to special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. To obtain the correct result, insert two or more instructions between the MFHI or MFLO instruction and the DDIV instruction.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   LO ← GPR[rs] div GPR[rt]
           HI ← GPR[rs] mod GPR[rt]
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DDIVU (Doubleword Divide Unsigned)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..6 0000000000 | 5..0 DDIVU (011111)
Format: DDIVU rs, rt (MIPS III)
Purpose: Divides 64-bit unsigned integers.
Description: The contents of general-purpose register rs are divided by the contents of general-purpose register rt, treating both operands as unsigned values. No integer overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero.
This instruction may be followed by additional instructions, inserted by the programmer, to check for a zero divisor.
When the operation completes, the quotient word of the double result is loaded to special register LO, and the remainder word of the double result is loaded to special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. To obtain the correct result, insert two or more instructions between the MFHI or MFLO instruction and the DDIVU instruction.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   LO ← (0 || GPR[rs]) div (0 || GPR[rt])
           HI ← (0 || GPR[rs]) mod (0 || GPR[rt])
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DIV (Divide)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..6 0000000000 | 5..0 DIV (011010)
Format: DIV rs, rt (MIPS I)
Purpose: Divides 32-bit signed integers.
Description: The contents of general-purpose register rs are divided by the contents of general-purpose register rt, treating both operands as signed values. No integer overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero. In 64-bit mode, the operands must be valid sign-extended 32-bit values.
This instruction is typically followed by additional instructions to check for a zero divisor and for overflow.
When the operation completes, the quotient word of the double result is loaded to special register LO, and the remainder word of the double result is loaded to special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. To obtain the correct result, insert two or more instructions between the MFHI or MFLO instruction and the DIV instruction.
Operation:
  32  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   LO ← GPR[rs] div GPR[rt]
           HI ← GPR[rs] mod GPR[rt]
  64  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   q  ← GPR[rs]_31..0 div GPR[rt]_31..0
           r  ← GPR[rs]_31..0 mod GPR[rt]_31..0
           LO ← (q_31)^32 || q_31..0
           HI ← (r_31)^32 || r_31..0
Exceptions: None

DIVU (Divide Unsigned)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..6 0000000000 | 5..0 DIVU (011011)
Format: DIVU rs, rt (MIPS I)
Purpose: Divides 32-bit unsigned integers.
Description: The contents of general-purpose register rs are divided by the contents of general-purpose register rt, treating both operands as unsigned values. No integer overflow exception occurs under any circumstances, and the result of this operation is undefined when the divisor is zero. In 64-bit mode, the operands must be valid sign-extended 32-bit values.
This instruction is typically followed by additional instructions to check for a zero divisor.
When the operation completes, the quotient word of the double result is loaded to special register LO, and the remainder word of the double result is loaded to special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of those instructions are undefined. To obtain the correct result, insert two or more instructions between the MFHI or MFLO instruction and the DIVU instruction.
Operation:
  32  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   LO ← (0 || GPR[rs]) div (0 || GPR[rt])
           HI ← (0 || GPR[rs]) mod (0 || GPR[rt])
  64  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   q  ← (0 || GPR[rs]_31..0) div (0 || GPR[rt]_31..0)
           r  ← (0 || GPR[rs]_31..0) mod (0 || GPR[rt]_31..0)
           LO ← (q_31)^32 || q_31..0
           HI ← (r_31)^32 || r_31..0
Exceptions: None

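In 64-bit mode, DIVU divides only the lower 32 bits of each operand and then sign-extends the 32-bit quotient and remainder into LO and HI. The sketch below models that step. The function names are illustrative, and raising `ZeroDivisionError` is a choice made for this model only, since the manual leaves the result undefined for a zero divisor and raises no exception.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def sign_extend32(word: int) -> int:
    """Return (word_31)^32 || word_31..0 as a 64-bit pattern."""
    word &= MASK32
    if word & 0x80000000:
        return word | (MASK64 ^ MASK32)
    return word


def divu_64bit_mode(rs_value: int, rt_value: int) -> tuple[int, int]:
    """Return (LO, HI) for DIVU executed in 64-bit mode."""
    dividend = rs_value & MASK32  # 0 || GPR[rs]_31..0
    divisor = rt_value & MASK32   # 0 || GPR[rt]_31..0
    if divisor == 0:
        # Model-only behavior: the hardware result is undefined here.
        raise ZeroDivisionError("DIVU result is undefined for a zero divisor")
    quotient, remainder = divmod(dividend, divisor)
    return sign_extend32(quotient), sign_extend32(remainder)
```

Note that a quotient with bit 31 set appears in LO with bits 63..32 all set, even though the division itself is unsigned.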
DMFC0 (Doubleword Move from System Control Coprocessor)

Encoding: 31..26 COP0 (010000) | 25..21 DMF (00001) | 20..16 rt | 15..11 rd | 10..0 00000000000
Format: DMFC0 rt, rd (MIPS III)
Description: The contents of coprocessor register rd of the CP0 are loaded to general-purpose register rt.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
The contents of the source coprocessor register rd are written to the 64-bit destination general-purpose register rt. The operation of DMFC0 on a 32-bit coprocessor 0 register is undefined.
Operation:
  64  T:     data ← CPR[0, rd]
      T + 1: GPR[rt] ← data
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Coprocessor unusable exception (64-/32-bit user/supervisor mode if CP0 is disabled)
            Reserved instruction exception (32-bit user/supervisor mode)

DMTC0 (Doubleword Move to System Control Coprocessor)

Encoding: 31..26 COP0 (010000) | 25..21 DMT (00101) | 20..16 rt | 15..11 rd | 10..0 00000000000
Format: DMTC0 rt, rd (MIPS III)
Description: The contents of general-purpose register rt are loaded to coprocessor register rd of the CP0.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
The contents of the source general-purpose register rt are written to the 64-bit destination coprocessor register rd. The operation of DMTC0 on a 32-bit coprocessor 0 register is undefined.
Because the state of the virtual address translation system may be altered by this instruction, the operation of load instructions, store instructions, and TLB operations immediately before and after this instruction is undefined.
Operation:
  64  T:     data ← GPR[rt]
      T + 1: CPR[0, rd] ← data
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Coprocessor unusable exception (64-/32-bit user/supervisor mode if CP0 is disabled)
            Reserved instruction exception (32-bit user/supervisor mode)

DMULT (Doubleword Multiply)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..6 0000000000 | 5..0 DMULT (011100)
Format: DMULT rs, rt (MIPS III)
Purpose: Multiplies 64-bit signed integers.
Description: The contents of general-purpose registers rs and rt are multiplied, treating both operands as signed values. No integer overflow exception occurs under any circumstances.
When the operation completes, the lower word of the double result is loaded to special register LO, and the higher word of the double result is loaded to special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined. To obtain the correct result, insert two or more instructions between the MFHI or MFLO instruction and the DMULT instruction.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   t  ← GPR[rs] * GPR[rt]
           LO ← t_63..0
           HI ← t_127..64
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DMULTU (Doubleword Multiply Unsigned)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..6 0000000000 | 5..0 DMULTU (011101)
Format: DMULTU rs, rt (MIPS III)
Purpose: Multiplies 64-bit unsigned integers.
Description: The contents of general-purpose registers rs and rt are multiplied, treating both operands as unsigned values. No integer overflow exception occurs under any circumstances.
When the operation completes, the lower word of the double result is loaded to special register LO, and the higher word of the double result is loaded to special register HI.
If either of the two preceding instructions is MFHI or MFLO, the results of these instructions are undefined. To obtain the correct result, insert two or more instructions between the MFHI or MFLO instruction and the DMULTU instruction.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T-2: LO ← undefined
           HI ← undefined
      T-1: LO ← undefined
           HI ← undefined
      T:   t  ← (0 || GPR[rs]) * (0 || GPR[rt])
           LO ← t_63..0
           HI ← t_127..64
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

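The difference between DMULT and DMULTU is visible only in the upper half of the 128-bit product. The following sketch models both, assuming register values are unsigned 64-bit bit patterns; the helper names are illustrative.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def to_signed64(value: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def dmult(rs_value: int, rt_value: int) -> tuple[int, int]:
    """Return (LO, HI) for the signed 64 x 64 -> 128-bit product."""
    product = (to_signed64(rs_value) * to_signed64(rt_value)) & MASK128
    return product & MASK64, product >> 64


def dmultu(rs_value: int, rt_value: int) -> tuple[int, int]:
    """Return (LO, HI) for the unsigned 64 x 64 -> 128-bit product."""
    product = (rs_value & MASK64) * (rt_value & MASK64)
    return product & MASK64, product >> 64
```

Multiplying the all-ones pattern by 2 gives the same LO in both cases, but HI is all ones for DMULT (the product is -2) and 1 for DMULTU.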
DROR (Doubleword Rotate Right)

Encoding: 31..26 SPECIAL (000000) | 25..21 00001 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DROR (111010)
Format: DROR rd, rt, sa (VR5500)
Purpose: Rotates a doubleword to the right by the specified number of bits (0 to 31 bits).
Description: This instruction shifts the contents of general-purpose register rt to the right by the number of bits specified by sa. The lower bits that are shifted out are inserted in the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: GPR[rd] ← GPR[rt]_sa-1..0 || GPR[rt]_63..sa
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DROR32 (Doubleword Rotate Right + 32)

Encoding: 31..26 SPECIAL (000000) | 25..21 00001 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DROR32 (111110)
Format: DROR32 rd, rt, sa (VR5500)
Purpose: Rotates a doubleword to the right by the specified number of bits (32 to 63 bits).
Description: This instruction shifts the contents of general-purpose register rt 32 + sa bits to the right. The lower bits that are shifted out are inserted in the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  32, 64  T: s ← sa + 32
             GPR[rd] ← GPR[rt]_s-1..0 || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DRORV (Doubleword Rotate Right Variable)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 00001 | 5..0 DRORV (010110)
Format: DRORV rd, rt, rs (VR5500)
Purpose: Rotates a doubleword to the right by the specified number of bits.
Description: This instruction shifts the contents of general-purpose register rt to the right by the number of bits specified by the lower 5 bits of general-purpose register rs. The lower bits that are shifted out are inserted in the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  32, 64  T: s ← GPR[rs]_4..0
             GPR[rd] ← GPR[rt]_s-1..0 || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

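The three rotate instructions share one operation and differ only in how the rotate amount s is formed. A minimal sketch follows, with illustrative function names and register values modeled as unsigned 64-bit patterns. The 5-bit amount for DRORV follows the operation description above.

```python
MASK64 = (1 << 64) - 1


def rotate_right64(value: int, s: int) -> int:
    """Return value_s-1..0 || value_63..s for 0 <= s <= 63."""
    value &= MASK64
    s &= 63
    # Bits shifted out at the bottom re-enter at the top.
    return ((value >> s) | (value << (64 - s))) & MASK64


def dror(rt_value: int, sa: int) -> int:
    """Rotate by sa (0 to 31)."""
    return rotate_right64(rt_value, sa & 31)


def dror32(rt_value: int, sa: int) -> int:
    """Rotate by sa + 32 (32 to 63)."""
    return rotate_right64(rt_value, (sa & 31) + 32)


def drorv(rt_value: int, rs_value: int) -> int:
    """Rotate by the lower 5 bits of rs, as stated in the description."""
    return rotate_right64(rt_value, rs_value & 0x1F)
```

Rotating 0x123456789ABCDEF0 right by 4 moves the low nibble to the top and yields 0x0123456789ABCDEF.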
DSLL (Doubleword Shift Left Logical)

Encoding: 31..26 SPECIAL (000000) | 25..21 00000 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DSLL (111000)
Format: DSLL rd, rt, sa (MIPS III)
Purpose: Shifts a doubleword to the left by the specified number of bits (0 to 31 bits).
Description: The contents of general-purpose register rt are shifted left by the number of bits specified by sa, inserting zeros into the lower bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← 0 || sa
         GPR[rd] ← GPR[rt]_(63-s)..0 || 0^s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSLL32 (Doubleword Shift Left Logical + 32)

Encoding: 31..26 SPECIAL (000000) | 25..21 00000 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DSLL32 (111100)
Format: DSLL32 rd, rt, sa (MIPS III)
Purpose: Shifts a doubleword to the left by the specified number of bits (32 to 63 bits).
Description: The contents of general-purpose register rt are shifted left by 32 + sa bits, inserting zeros into the lower bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← 1 || sa
         GPR[rd] ← GPR[rt]_(63-s)..0 || 0^s
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSLLV (Doubleword Shift Left Logical Variable)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 00000 | 5..0 DSLLV (010100)
Format: DSLLV rd, rt, rs (MIPS III)
Purpose: Shifts a doubleword to the left by the specified number of bits.
Description: The contents of general-purpose register rt are shifted left by the number of bits specified by the lower 6 bits of general-purpose register rs, inserting zeros into the lower bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← GPR[rs]_5..0
         GPR[rd] ← GPR[rt]_(63-s)..0 || 0^s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSRA (Doubleword Shift Right Arithmetic)

Encoding: 31..26 SPECIAL (000000) | 25..21 00000 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DSRA (111011)
Format: DSRA rd, rt, sa (MIPS III)
Purpose: Arithmetically shifts a doubleword to the right by the specified number of bits (0 to 31 bits).
Description: The contents of general-purpose register rt are shifted right by sa bits, sign-extending the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← 0 || sa
         GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSRA32 (Doubleword Shift Right Arithmetic + 32)

Encoding: 31..26 SPECIAL (000000) | 25..21 00000 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DSRA32 (111111)
Format: DSRA32 rd, rt, sa (MIPS III)
Purpose: Arithmetically shifts a doubleword to the right by the specified number of bits (32 to 63 bits).
Description: The contents of general-purpose register rt are shifted right by 32 + sa bits, sign-extending the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← 1 || sa
         GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSRAV (Doubleword Shift Right Arithmetic Variable)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 00000 | 5..0 DSRAV (010111)
Format: DSRAV rd, rt, rs (MIPS III)
Purpose: Arithmetically shifts a doubleword to the right by the specified number of bits.
Description: The contents of general-purpose register rt are shifted right by the number of bits specified by the lower 6 bits of general-purpose register rs, sign-extending the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← GPR[rs]_5..0
         GPR[rd] ← (GPR[rt]_63)^s || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSRL (Doubleword Shift Right Logical)

Encoding: 31..26 SPECIAL (000000) | 25..21 00000 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DSRL (111010)
Format: DSRL rd, rt, sa (MIPS III)
Purpose: Logically shifts a doubleword to the right by the specified number of bits (0 to 31 bits).
Description: The contents of general-purpose register rt are shifted right by sa bits, inserting zeros into the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← 0 || sa
         GPR[rd] ← 0^s || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSRL32 (Doubleword Shift Right Logical + 32)

Encoding: 31..26 SPECIAL (000000) | 25..21 00000 | 20..16 rt | 15..11 rd | 10..6 sa | 5..0 DSRL32 (111110)
Format: DSRL32 rd, rt, sa (MIPS III)
Purpose: Logically shifts a doubleword to the right by the specified number of bits (32 to 63 bits).
Description: The contents of general-purpose register rt are shifted right by 32 + sa bits, inserting zeros into the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← 1 || sa
         GPR[rd] ← 0^s || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

DSRLV (Doubleword Shift Right Logical Variable)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 00000 | 5..0 DSRLV (010110)
Format: DSRLV rd, rt, rs (MIPS III)
Purpose: Logically shifts a doubleword to the right by the specified number of bits.
Description: The contents of general-purpose register rt are shifted right by the number of bits specified by the lower 6 bits of general-purpose register rs, inserting zeros into the higher bits. The result is stored in general-purpose register rd.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: s ← GPR[rs]_5..0
         GPR[rd] ← 0^s || GPR[rt]_63..s
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

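The doubleword shift instructions differ in two respects: how the 6-bit shift amount s is formed (0 || sa, 1 || sa, or the lower 6 bits of rs) and what fills the vacated bits. The sketch below models both aspects. All names are illustrative, and register values are treated as unsigned 64-bit patterns.

```python
MASK64 = (1 << 64) - 1


def shift_amount(sa: int, plus32: bool = False) -> int:
    """Form s as 0 || sa, or as 1 || sa for the '32' variants."""
    return (32 if plus32 else 0) | (sa & 31)


def variable_shift_amount(rs_value: int) -> int:
    """Form s from GPR[rs]_5..0 for the variable variants."""
    return rs_value & 0x3F


def dsll(rt_value: int, s: int) -> int:
    """Shift left, filling the lower bits with zeros."""
    return (rt_value << s) & MASK64


def dsrl(rt_value: int, s: int) -> int:
    """Shift right, filling the higher bits with zeros."""
    return (rt_value & MASK64) >> s


def dsra(rt_value: int, s: int) -> int:
    """Shift right, filling the higher bits with copies of bit 63."""
    rt_value &= MASK64
    signed = rt_value - (1 << 64) if rt_value >> 63 else rt_value
    # Python's >> on a negative int is an arithmetic shift.
    return (signed >> s) & MASK64
```

For instance, DSRA32 with sa = 31 corresponds to `dsra(value, shift_amount(31, plus32=True))`, a shift by 63 that leaves only copies of the sign bit.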
DSUB (Doubleword Subtract)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 00000 | 5..0 DSUB (101110)
Format: DSUB rd, rs, rt (MIPS III)
Purpose: Subtracts 64-bit integers. A trap is performed if an overflow occurs.
Description: The contents of general-purpose register rt are subtracted from the contents of general-purpose register rs, and the result is stored in general-purpose register rd.
An integer overflow exception occurs if the carries out of bits 62 and 63 differ (2's complement overflow). The destination register rd is not modified when an integer overflow exception occurs.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: GPR[rd] ← GPR[rs] - GPR[rt]
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Integer overflow exception
            Reserved instruction exception (32-bit user/supervisor mode)

DSUBU (Doubleword Subtract Unsigned)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 rt | 15..11 rd | 10..6 00000 | 5..0 DSUBU (101111)
Format: DSUBU rd, rs, rt (MIPS III)
Purpose: Subtracts 64-bit integers.
Description: The contents of general-purpose register rt are subtracted from the contents of general-purpose register rs, and the result is stored in general-purpose register rd.
The only difference between this instruction and the DSUB instruction is that DSUBU never causes an integer overflow exception.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: GPR[rd] ← GPR[rs] - GPR[rt]
Remark: The operation is the same in the 32-bit kernel mode.
Exceptions: Reserved instruction exception (32-bit user/supervisor mode)

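DSUB and DSUBU compute the same 64-bit difference and differ only in the overflow check. The following sketch models that difference. The function names are illustrative, and Python's `OverflowError` stands in for the integer overflow exception; it is not how the processor reports the condition. The range check used here is equivalent to the "carries out of bits 62 and 63 differ" rule for 2's complement values.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def dsubu(rs_value: int, rt_value: int) -> int:
    """Return GPR[rs] - GPR[rt] modulo 2**64; never signals overflow."""
    return (rs_value - rt_value) & MASK64


def dsub(rs_value: int, rt_value: int) -> int:
    """Return GPR[rs] - GPR[rt], signaling 2's complement overflow."""
    difference = to_signed64(rs_value) - to_signed64(rt_value)
    if not -(1 << 63) <= difference < (1 << 63):
        # The destination register is left unmodified in this case.
        raise OverflowError("integer overflow exception")
    return difference & MASK64
```

Subtracting 1 from the most negative value (0x8000000000000000) overflows under `dsub` but wraps to 0x7FFFFFFFFFFFFFFF under `dsubu`.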
ERET (Return from Exception)

Encoding: 31..26 COP0 (010000) | 25 CO (1) | 24..6 0000000000000000000 | 5..0 ERET (011000)
Format: ERET (MIPS III)
Description: The ERET instruction is the instruction for returning from an interrupt, exception, or error exception. Unlike a branch or jump instruction, ERET does not execute the next instruction.
The ERET instruction must not be placed in a branch delay slot.
If the ERL bit of the Status register is set (SR_2 = 1), the contents of the ErrorEPC register are loaded to the PC and the ERL bit is cleared (SR_2 = 0). Otherwise (SR_2 = 0), the contents of the EPC register are loaded to the PC, and the EXL bit of the Status register is cleared (SR_1 = 0).
Because the LL bit is cleared by the ERET instruction, executing ERET between an LL or LLD instruction and the corresponding SC or SCD instruction causes the SC or SCD instruction to fail.
Operation:
  32, 64  T: if SR_2 = 1 then
               PC ← ErrorEPC
               SR ← SR_31..3 || 0 || SR_1..0
             else
               PC ← EPC
               SR ← SR_31..2 || 0 || SR_0
             endif
             LLbit ← 0
Exceptions: Coprocessor unusable exception

J (Jump)

Encoding: 31..26 J (000010) | 25..0 target
Format: J target (MIPS I)
Purpose: Jumps to an address within the current 256 MB-aligned area.
Description: The 26-bit target address is shifted left two bits and combined with the higher bits of the address of the delay slot (bits 31..28 in 32-bit mode, bits 63..28 in 64-bit mode). The program unconditionally jumps to this calculated address with a delay of one instruction.
Operation:
  32  T:     temp ← target
      T + 1: PC ← PC_31..28 || temp || 0^2
  64  T:     temp ← target
      T + 1: PC ← PC_63..28 || temp || 0^2
Exceptions: None

JAL (Jump and Link)

Encoding: 31..26 JAL (000011) | 25..0 target
Format: JAL target (MIPS I)
Purpose: Executes a procedure call within the current 256 MB-aligned area.
Description: The 26-bit target address is shifted left two bits and combined with the higher bits of the address of the delay slot (bits 31..28 in 32-bit mode, bits 63..28 in 64-bit mode). The program unconditionally jumps to this calculated address with a delay of one instruction. The address of the instruction immediately after the delay slot is placed in the link register (r31).
Operation:
  32  T:     temp ← target
             GPR[31] ← PC + 8
      T + 1: PC ← PC_31..28 || temp || 0^2
  64  T:     temp ← target
             GPR[31] ← PC + 8
      T + 1: PC ← PC_63..28 || temp || 0^2
Exceptions: None

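The target address calculation shared by J and JAL can be sketched as follows. The function names and the `bits` parameter are illustrative; `delay_slot_pc` is the address of the delay slot instruction, which supplies the upper bits of the result, and `jal_pc` is the address of the JAL instruction itself.

```python
def jump_target(delay_slot_pc: int, target: int, bits: int = 64) -> int:
    """Return PC_(bits-1)..28 || target || 0^2."""
    address_mask = (1 << bits) - 1
    # Keep the bits above the 256 MB (2**28 byte) area.
    area_base = delay_slot_pc & address_mask & ~0x0FFFFFFF
    # The 26-bit target is a word index within the area.
    return area_base | ((target & 0x03FFFFFF) << 2)


def jal_link_address(jal_pc: int, bits: int = 64) -> int:
    """Return the value stored in r31: the instruction after the delay slot."""
    return (jal_pc + 8) & ((1 << bits) - 1)
```

Because only the lower 28 bits of the address are replaced, a J or JAL cannot leave the 256 MB area that contains its delay slot.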
JALR (Jump and Link Register)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..16 00000 | 15..11 rd | 10..6 00000 | 5..0 JALR (001001)
Format: JALR rs
        JALR rd, rs (MIPS I)
Purpose: Executes a procedure call to an instruction address in a register.
Description: The program unconditionally jumps to the address contained in general-purpose register rs with a delay of one instruction. The address of the instruction immediately after the delay slot is placed in general-purpose register rd. The default value of rd, if omitted in the assembly language instruction, is 31.
Register numbers rs and rd must not be equal. If they are equal, storing the link address destroys the contents of rs, so the instruction does not have the same effect when re-executed. Executing such an instruction does not cause an exception, but the result is undefined.
The effective target address in general-purpose register rs must be aligned. If the lower 2 bits are not zero, an address error exception occurs when the jump target instruction is subsequently fetched.
Operation:
  32, 64  T:     temp ← GPR[rs]
                 GPR[rd] ← PC + 8
          T + 1: PC ← temp
Exceptions: Address error exception

JR (Jump Register)

Encoding: 31..26 SPECIAL (000000) | 25..21 rs | 20..6 000000000000000 | 5..0 JR (001000)
Format: JR rs (MIPS I)
Description: The program unconditionally jumps to the address contained in general-purpose register rs with a delay of one instruction.
The effective target address in general-purpose register rs must be aligned. If the lower 2 bits are not zero, an address error exception occurs when the jump target instruction is subsequently fetched.
Operation:
  32, 64  T:     temp ← GPR[rs]
          T + 1: PC ← temp
Exceptions: Address error exception

LB (Load Byte)

Encoding: 31..26 LB (100000) | 25..21 base | 20..16 rt | 15..0 offset
Format: LB rt, offset (base) (MIPS I)
Purpose: Loads 1 byte from memory as a signed value.
Description: The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the byte at the memory location specified by the effective address are sign-extended and loaded to general-purpose register rt.
Operation:
  32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
         byte ← vAddr_2..0 xor BigEndianCPU^3
         GPR[rt] ← (mem_7+8*byte)^24 || mem_7+8*byte..8*byte
  64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
         byte ← vAddr_2..0 xor BigEndianCPU^3
         GPR[rt] ← (mem_7+8*byte)^56 || mem_7+8*byte..8*byte
Exceptions: TLB refill exception
            TLB invalid exception
            Bus error exception
            Address error exception

LBU (Load Byte Unsigned)

Encoding: 31..26 LBU (100100) | 25..21 base | 20..16 rt | 15..0 offset
Format: LBU rt, offset (base) (MIPS I)
Purpose: Loads 1 byte from memory as an unsigned value.
Description: The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the byte at the memory location specified by the effective address are zero-extended and loaded to general-purpose register rt.
Operation:
  32  T: vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
         byte ← vAddr_2..0 xor BigEndianCPU^3
         GPR[rt] ← 0^24 || mem_7+8*byte..8*byte
  64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         mem ← LoadMemory (uncached, BYTE, pAddr, vAddr, DATA)
         byte ← vAddr_2..0 xor BigEndianCPU^3
         GPR[rt] ← 0^56 || mem_7+8*byte..8*byte
Exceptions: TLB refill exception
            TLB invalid exception
            Bus error exception
            Address error exception

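LB and LBU perform the same address calculation and differ only in how the loaded byte is widened to 64 bits. The sketch below models the 64-bit case with a flat `bytes` object standing in for memory. Address translation, endianness handling, and exceptions are deliberately left out, and all names are illustrative.

```python
MASK64 = (1 << 64) - 1


def effective_address(base_value: int, offset: int) -> int:
    """Return GPR[base] + sign-extended 16-bit offset, modulo 2**64."""
    offset &= 0xFFFF
    if offset & 0x8000:
        offset |= MASK64 ^ 0xFFFF
    return (base_value + offset) & MASK64


def lb(memory: bytes, base_value: int, offset: int) -> int:
    """Load one byte and sign-extend it: (byte_7)^56 || byte."""
    byte = memory[effective_address(base_value, offset)]
    return byte | (MASK64 ^ 0xFF) if byte & 0x80 else byte


def lbu(memory: bytes, base_value: int, offset: int) -> int:
    """Load one byte and zero-extend it: 0^56 || byte."""
    return memory[effective_address(base_value, offset)]
```

A byte value of 0x80 therefore loads as 0xFFFFFFFFFFFFFF80 with LB and as 0x80 with LBU. A negative offset such as 0xFFFF addresses the byte just below the base.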
LD (Load Doubleword)

Encoding: 31..26 LD (110111) | 25..21 base | 20..16 rt | 15..0 offset
Format: LD rt, offset (base) (MIPS III)
Purpose: Loads a doubleword from memory.
Description: The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the 64-bit doubleword at the memory location specified by the effective address are loaded to general-purpose register rt.
An address error exception occurs if the lower 3 bits of the effective address are not 0.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
  64  T: vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
         (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
         mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
         GPR[rt] ← mem
Remark: The higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.
Exceptions: TLB refill exception
            TLB invalid exception
            Bus error exception
            Address error exception
            Reserved instruction exception (32-bit user/supervisor mode)

LDCz    Load Doubleword to Coprocessor z

Encoding:
  Bits    31-26          25-21   20-16   15-0
  Field   LDCz           base    rt      offset
  Value   1101xx (Note)

Format:   LDCz rt, offset (base)    [MIPS II]

Purpose:  Loads a doubleword from memory to the coprocessor general-purpose register.

Description:
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the doubleword at the memory location specified by the effective address are loaded to CPz register rt. How the data is used is defined for each processor.
An address error exception occurs if the lower 3 bits of the address are not 0.
This instruction is invalid for CP0.
If CP1 is specified and the FR bit of the Status register is 0, only an even register number can be specified, because an adjacent pair of even and odd register numbers is used to designate one general-purpose register. If an odd number is specified, the operation is undefined. If the FR bit of the Status register is 1, both odd and even register numbers are valid.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
          COPzLD (rt, mem)
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
          COPzLD (rt, mem)

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Coprocessor unusable exception

Note  See the opcode table below, or 17.4 CPU Instruction Opcode Bit Encoding.
Opcode table (LDCz):

  Bit     31  30  29  28    27  26    25 ... 0
  LDC1    1   1   0   1     0   1
  LDC2    1   1   0   1     1   0
          Opcode            Coprocessor no.

Remark  Coprocessor 2 is reserved in the VR5500.
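As a rough illustration of the opcode table, the sketch below assembles an LDCz instruction word from its fields. The function name encode_ldcz is hypothetical, and the field positions (opcode in bits 31-26, base in 25-21, rt in 20-16, offset in 15-0) are taken from the encoding figure for this instruction.

```python
def encode_ldcz(z, base, rt, offset):
    """Build the 32-bit LDCz instruction word for coprocessor z (1 or 2)."""
    if z not in (1, 2):
        raise ValueError("the opcode table lists only LDC1 and LDC2")
    opcode = 0b110100 | z          # bits 31-28 = 1101, bits 27-26 = z
    return ((opcode << 26)
            | ((base & 0x1F) << 21)
            | ((rt & 0x1F) << 16)
            | (offset & 0xFFFF))

# LDC1 with base register 4, coprocessor register 2, offset 8.
print(hex(encode_ldcz(1, 4, 2, 8)))
```

Note that coprocessor 2 is reserved in the VR5500, so only the z = 1 encoding is expected to be useful on this device.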
LDL    Load Doubleword Left

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LDL      base    rt      offset
  Value   011010

Format:   LDL rt, offset (base)    [MIPS III]

Purpose:  Loads the most significant part of a doubleword from unaligned memory.

Description:
This instruction can be used in combination with the LDR instruction to load doubleword data that is not aligned at a doubleword boundary in memory to general-purpose register rt. The LDL instruction loads the higher-order portion of the data, and the LDR instruction loads the lower-order portion of the data to the register.
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address that can specify an arbitrary byte. Of the doubleword data in memory whose most significant byte is the byte specified by the virtual address, only the data within the same doubleword boundary as the target address is loaded and stored in the higher portion of general-purpose register rt. Other bits in general-purpose register rt are not changed. The number of bytes loaded varies from one to eight depending on the byte specified.
In other words, the byte specified by the virtual address is stored in the most significant byte of general-purpose register rt. As long as lower bytes remain within the same doubleword boundary, each is stored in the next byte of general-purpose register rt. The remaining lower bytes of the register are not changed.

Example: LDL $24, 12 ($0), little-endian memory

  Memory (byte addresses):
    Address 8:   15  14  13  12  11  10   9   8
    Address 0:    7   6   5   4   3   2   1   0

  Register $24:
    Before load:   A   B   C   D   E   F   G   H
    After load:   12  11  10   9   8   F   G   H
The contents of general-purpose register rt are internally bypassed within the processor, so no NOP is needed between an immediately preceding load instruction that specifies register rt and a following LDL (or LDR) instruction that also specifies register rt.
An address error exception caused by the specified address not being aligned at a doubleword boundary does not occur.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.

Operation (LDL):
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_PSIZE-1..3 || 0^3
          endif
          byte ← vAddr_2..0 xor BigEndianCPU^3
          mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
          GPR[rt] ← mem_7+8*byte..0 || GPR[rt]_55-8*byte..0

Remark  The higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.
The relationship between the address assigned to the LDL instruction and its result (each byte of the register) is shown below.

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                   BigEndianCPU = 1
  vAddr_2..0   Destination  Type  Offset          Destination  Type  Offset
                                  LEM  BEM                           LEM  BEM
  0            PBCDEFGH     0     0    7          IJKLMNOP     7     0    0
  1            OPCDEFGH     1     0    6          JKLMNOPH     6     0    1
  2            NOPDEFGH     2     0    5          KLMNOPGH     5     0    2
  3            MNOPEFGH     3     0    4          LMNOPFGH     4     0    3
  4            LMNOPFGH     4     0    3          MNOPEFGH     3     0    4
  5            KLMNOPGH     5     0    2          NOPDEFGH     2     0    5
  6            JKLMNOPH     6     0    1          OPCDEFGH     1     0    6
  7            IJKLMNOP     7     0    0          PBCDEFGH     0     0    7

Remark  Type:    AccessType (see Figure 3-3 Byte Specification Related to Load and Store Instruction) output to memory
        Offset:  pAddr_2..0 output to memory
        LEM:     Little-endian memory (BigEndianMem = 0)
        BEM:     Big-endian memory (BigEndianMem = 1)

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Reserved instruction exception (32-bit user/supervisor mode)
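The BigEndianCPU = 0 column of the LDL table can be reproduced with a short Python model of the final register merge, GPR[rt] ← mem_7+8*byte..0 || GPR[rt]_55-8*byte..0. This is a sketch under simplifying assumptions: the aligned doubleword is passed in directly as a 64-bit integer, address translation is omitted, and the names ldl_le and destination are hypothetical.

```python
REGISTER = int.from_bytes(b"ABCDEFGH", "big")   # A is the most significant byte
MEMORY = int.from_bytes(b"IJKLMNOP", "big")     # P is the byte at offset 0

def ldl_le(reg, mem_dword, vaddr):
    """Merge performed by LDL on a little-endian CPU (BigEndianCPU = 0)."""
    byte = vaddr & 7
    nbits = 8 * (byte + 1)
    loaded = mem_dword & ((1 << nbits) - 1)     # mem[7+8*byte .. 0]
    kept = reg & ((1 << (64 - nbits)) - 1)      # GPR[rt][55-8*byte .. 0]
    return (loaded << (64 - nbits)) | kept

def destination(vaddr):
    """Render the resulting register as eight letters, most significant first."""
    return ldl_le(REGISTER, MEMORY, vaddr).to_bytes(8, "big").decode("ascii")

for vaddr in range(8):
    print(vaddr, destination(vaddr))
```

Each printed row corresponds to one Destination entry in the left half of the table.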
LDR    Load Doubleword Right

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LDR      base    rt      offset
  Value   011011

Format:   LDR rt, offset (base)    [MIPS III]

Purpose:  Loads the least significant part of a doubleword from unaligned memory.

Description:
This instruction can be used in combination with the LDL instruction to load doubleword data that is not aligned at a doubleword boundary in memory to general-purpose register rt. The LDL instruction loads the higher-order portion of the data, and the LDR instruction loads the lower-order portion of the data to the register.
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address that can specify an arbitrary byte. Of the doubleword data in memory whose least significant byte is the byte specified by the virtual address, only the data within the same doubleword boundary as the target address is loaded and stored in the lower portion of general-purpose register rt. Other bits in general-purpose register rt are not changed. The number of bytes loaded varies from one to eight depending on the byte specified.
In other words, the byte specified by the virtual address is stored in the least significant byte of general-purpose register rt. As long as higher bytes remain within the same doubleword boundary, each is stored in the next byte of general-purpose register rt. The remaining higher bytes of the register are not changed.

Example: LDR $24, 5 ($0), little-endian memory

  Memory (byte addresses):
    Address 8:   15  14  13  12  11  10   9   8
    Address 0:    7   6   5   4   3   2   1   0

  Register $24:
    Before load:   A   B   C   D   E   F   G   H
    After load:    A   B   C   D   E   7   6   5
The contents of general-purpose register rt are internally bypassed within the processor, so no NOP is needed between an immediately preceding load instruction that specifies register rt and a following LDR (or LDL) instruction that also specifies register rt.
An address error exception caused by the specified address not being aligned at a doubleword boundary does not occur.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.

Operation (LDR):
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
          if BigEndianMem = 1 then
              pAddr ← pAddr_PSIZE-1..3 || 0^3
          endif
          byte ← vAddr_2..0 xor BigEndianCPU^3
          mem ← LoadMemory (uncached, DOUBLEWORD - byte, pAddr, vAddr, DATA)
          GPR[rt] ← GPR[rt]_63..64-8*byte || mem_63..8*byte

Remark  The higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.
The relationship between the address assigned to the LDR instruction and its result (each byte of the register) is shown below.

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                   BigEndianCPU = 1
  vAddr_2..0   Destination  Type  Offset          Destination  Type  Offset
                                  LEM  BEM                           LEM  BEM
  0            IJKLMNOP     7     0    0          ABCDEFGI     0     7    0
  1            AIJKLMNO     6     1    0          ABCDEFIJ     1     6    0
  2            ABIJKLMN     5     2    0          ABCDEIJK     2     5    0
  3            ABCIJKLM     4     3    0          ABCDIJKL     3     4    0
  4            ABCDIJKL     3     4    0          ABCIJKLM     4     3    0
  5            ABCDEIJK     2     5    0          ABIJKLMN     5     2    0
  6            ABCDEFIJ     1     6    0          AIJKLMNO     6     1    0
  7            ABCDEFGI     0     7    0          IJKLMNOP     7     0    0

Remark  Type:    AccessType (see Figure 3-3 Byte Specification Related to Load and Store Instruction) output to memory
        Offset:  pAddr_2..0 output to memory
        LEM:     Little-endian memory (BigEndianMem = 0)
        BEM:     Big-endian memory (BigEndianMem = 1)

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Reserved instruction exception (32-bit user/supervisor mode)
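The following Python sketch shows how an LDR/LDL pair might assemble an unaligned doubleword on a little-endian CPU (BigEndianCPU = 0). It is an assumption-laden model, not the processor's implementation: memory is a flat byte string, address translation is omitted, and ldr_le, ldl_le, aligned_dword, and load_unaligned_dword are hypothetical names.

```python
MASK64 = (1 << 64) - 1

def ldr_le(reg, dword, vaddr):
    """GPR[rt] <- GPR[rt][63 .. 64-8*byte] || mem[63 .. 8*byte]"""
    shift = 8 * (vaddr & 7)
    kept = reg & ~((1 << (64 - shift)) - 1) & MASK64
    return kept | (dword >> shift)

def ldl_le(reg, dword, vaddr):
    """GPR[rt] <- mem[7+8*byte .. 0] || GPR[rt][55-8*byte .. 0]"""
    nbits = 8 * ((vaddr & 7) + 1)
    kept = reg & ((1 << (64 - nbits)) - 1)
    return ((dword << (64 - nbits)) & MASK64) | kept

def aligned_dword(memory, vaddr):
    """Fetch the aligned doubleword that contains vaddr."""
    start = vaddr & ~7
    return int.from_bytes(memory[start:start + 8], "little")

def load_unaligned_dword(memory, vaddr):
    """Equivalent of: ldr rt, 0(vaddr) followed by ldl rt, 7(vaddr)."""
    reg = 0
    reg = ldr_le(reg, aligned_dword(memory, vaddr), vaddr)
    reg = ldl_le(reg, aligned_dword(memory, vaddr + 7), vaddr + 7)
    return reg

memory = bytes(range(16))            # byte at address n holds the value n
print(hex(load_unaligned_dword(memory, 5)))
```

With this memory image, LDR at address 5 supplies bytes 5 to 7 and LDL at address 12 supplies bytes 8 to 12, matching the two register examples given for these instructions.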
LH    Load Halfword

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LH       base    rt      offset
  Value   100001

Format:   LH rt, offset (base)    [MIPS I]

Purpose:  Loads a halfword from memory as a signed value.

Description:
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the halfword at the memory location specified by the effective address are sign-extended and loaded to general-purpose register rt.
An address error exception occurs if the least significant bit of the address is not 0.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian^2 || 0))
          mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU^2 || 0)
          GPR[rt] ← (mem_15+8*byte)^16 || mem_15+8*byte..8*byte
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian^2 || 0))
          mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU^2 || 0)
          GPR[rt] ← (mem_15+8*byte)^48 || mem_15+8*byte..8*byte

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
LHU    Load Halfword Unsigned

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LHU      base    rt      offset
  Value   100101

Format:   LHU rt, offset (base)    [MIPS I]

Purpose:  Loads a halfword from memory as an unsigned value.

Description:
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the halfword at the memory location specified by the effective address are zero-extended and loaded to general-purpose register rt.
An address error exception occurs if the least significant bit of the address is not 0.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian^2 || 0))
          mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU^2 || 0)
          GPR[rt] ← 0^16 || mem_15+8*byte..8*byte
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian^2 || 0))
          mem ← LoadMemory (uncached, HALFWORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU^2 || 0)
          GPR[rt] ← 0^48 || mem_15+8*byte..8*byte

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
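The difference between LH and LHU lies only in how the 16-bit value is widened to the register width. The Python sketch below contrasts the two, plus the alignment condition; the function names and the xlen parameter (register width in bits) are assumptions made for illustration.

```python
def lh(halfword, xlen=64):
    """Sign-extend a 16-bit value to xlen bits, as LH does."""
    halfword &= 0xFFFF
    if halfword & 0x8000:
        halfword |= ((1 << xlen) - 1) & ~0xFFFF   # replicate bit 15 upward
    return halfword

def lhu(halfword):
    """Zero-extend a 16-bit value, as LHU does."""
    return halfword & 0xFFFF

def check_halfword_alignment(vaddr):
    """Model of the address error condition: bit 0 of the address must be 0."""
    if vaddr & 1:
        raise ValueError("address error: halfword address must be even")

print(hex(lh(0x8001)), hex(lhu(0x8001)))
```

For a value with bit 15 clear, both functions return the same result; they differ only when bit 15 is set.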
LL    Load Linked

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LL       base    rt      offset
  Value   110000

Format:   LL rt, offset (base)    [MIPS II]

Purpose:  Loads a word from memory for atomic read-modify-write.

Description:
This instruction sign-extends a 16-bit offset and adds the result to the contents of general-purpose register base to generate a virtual address. It loads the contents of the word at the specified memory address to general-purpose register rt. In the 64-bit mode, the loaded word is sign-extended. In addition, the physical address of the specified memory is stored in the LLAddr register and the LL bit is set to 1. After that, the processor checks whether the address stored in the LLAddr register has been rewritten by another processor or device.
Memory in a multiprocessor system can be updated safely by using the LL and SC instructions. These instructions are used as shown in the following example.

    L1: ll    t1, (t0)
        addi  t2, t1, 1
        sc    t2, (t0)
        beq   t2, 0, L1
        nop

In this example, the word addressed by t0 is atomically incremented. By replacing the ADDI instruction with the ORI instruction, a bit is atomically set.
This instruction can be used in all modes, and it is not necessary to enable CP0.
This instruction is defined to maintain compatibility with the other VR Series processors.
The operation of the LL instruction is undefined if the specified address is in an uncached area.
A cache miss that may occur between the LL and SC instructions prevents execution of the SC instruction. Therefore, do not use a load or store instruction between the LL and SC instructions. Otherwise, the operation of the SC instruction is not guaranteed. If exceptions occur frequently, they must be temporarily disabled because they also prevent execution of the SC instruction.
An address error exception occurs if the lower 2 bits of the address are not 0.

Operation (LL):
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          GPR[rt] ← mem
          LLbit ← 1
          LLAddr ← pAddr
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          GPR[rt] ← mem
          LLbit ← 1
          LLAddr ← pAddr

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
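The LL/SC retry loop can be illustrated with a toy Python model of the LL bit and the LLAddr register. This is a deliberately simplified sketch: the LinkedMemory class, its external_write method, and the interference parameter are all hypothetical, and real hardware clears the link under more conditions (cache misses, exceptions) than the single case modeled here.

```python
class LinkedMemory:
    """Word-addressed memory with a single LL bit and LLAddr register."""

    def __init__(self, size):
        self.words = [0] * size
        self.ll_bit = 0
        self.ll_addr = None

    def ll(self, addr):
        self.ll_bit = 1
        self.ll_addr = addr
        return self.words[addr]

    def external_write(self, addr, value):
        """A store by another processor or device; breaks a matching link."""
        self.words[addr] = value
        if addr == self.ll_addr:
            self.ll_bit = 0

    def sc(self, addr, value):
        """Store only if the link is intact. Returns 1 on success, 0 on failure."""
        ok = self.ll_bit == 1 and addr == self.ll_addr
        if ok:
            self.words[addr] = value
        self.ll_bit = 0
        return 1 if ok else 0

def atomic_increment(mem, addr, interference=()):
    """Retry loop in the style of the LL/ADDI/SC/BEQ sequence."""
    pending = list(interference)      # values written by "other" devices
    attempts = 0
    while True:
        attempts += 1
        t1 = mem.ll(addr)                         # ll   t1, (t0)
        if pending:
            mem.external_write(addr, pending.pop(0))
        t2 = t1 + 1                               # addi t2, t1, 1
        if mem.sc(addr, t2):                      # sc   t2, (t0)
            return attempts                       # beq  t2, 0, L1 otherwise

mem = LinkedMemory(4)
print(atomic_increment(mem, 0), mem.words[0])
print(atomic_increment(mem, 0, interference=[10]), mem.words[0])
```

In the second call, the simulated external write lands between LL and SC, so the first SC fails and the loop repeats with the freshly written value.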
LLD    Load Linked Doubleword

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LLD      base    rt      offset
  Value   110100

Format:   LLD rt, offset (base)    [MIPS III]

Purpose:  Loads a doubleword from memory for atomic read-modify-write.

Description:
This instruction sign-extends a 16-bit offset and adds the result to the contents of general-purpose register base to generate a virtual address. It loads the contents of the doubleword at the specified memory address to general-purpose register rt. In addition, the physical address of the specified memory is stored in the LLAddr register and the LL bit is set to 1. After that, the processor checks whether the address stored in the LLAddr register has been rewritten by another processor or device.
Memory in a multiprocessor system can be updated safely by using the LLD and SCD instructions. These instructions are used as shown in the following example.

    L1: lld    t1, (t0)
        daddi  t2, t1, 1
        scd    t2, (t0)
        beq    t2, 0, L1
        nop

In this example, the doubleword addressed by t0 is atomically incremented. By replacing the DADDI instruction with the ORI instruction, a bit is atomically set.
This instruction is defined to maintain compatibility with the other VR Series processors.
The operation of the LLD instruction is undefined if the specified address is in an uncached area.
A cache miss that may occur between the LLD and SCD instructions prevents execution of the SCD instruction. Therefore, do not use a load or store instruction between the LLD and SCD instructions. Otherwise, the operation of the SCD instruction is not guaranteed. If exceptions occur frequently, they must be temporarily disabled because they also prevent execution of the SCD instruction.
An address error exception occurs if the lower 3 bits of the address are not 0.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.

Operation (LLD):
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
          GPR[rt] ← mem
          LLbit ← 1
          LLAddr ← pAddr
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, DOUBLEWORD, pAddr, vAddr, DATA)
          GPR[rt] ← mem
          LLbit ← 1
          LLAddr ← pAddr

Remark  The higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Reserved instruction exception (32-bit user/supervisor mode)
LUI    Load Upper Immediate

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LUI      0       rt      immediate
  Value   001111   00000

Format:   LUI rt, immediate    [MIPS I]

Purpose:  Loads a constant to the upper half of a word.

Description:
The 16-bit immediate is shifted left 16 bits and concatenated with 16 bits of zeros. The result is stored in general-purpose register rt. In the 64-bit mode, the loaded word is sign-extended.

Operation:
  32  T:  GPR[rt] ← immediate || 0^16
  64  T:  GPR[rt] ← (immediate_15)^32 || immediate || 0^16

Exceptions:
  None
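The two Operation lines for LUI translate almost directly into Python. In this sketch the function name lui and the mode64 flag (selecting the 64-bit or 32-bit form) are hypothetical.

```python
def lui(immediate, mode64=True):
    """Return the value LUI places in GPR[rt]."""
    word = (immediate & 0xFFFF) << 16            # immediate || 0^16
    if mode64 and word & 0x80000000:
        word |= 0xFFFFFFFF00000000               # (immediate_15)^32 prefix
    return word

print(hex(lui(0x1234)))
print(hex(lui(0x8000)))
print(hex(lui(0x8000, mode64=False)))
```

An immediate with bit 15 set therefore yields a value with all upper 32 bits set in the 64-bit mode.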
LW    Load Word

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LW       base    rt      offset
  Value   100011

Format:   LW rt, offset (base)    [MIPS I]

Purpose:  Loads a word from memory as a signed value.

Description:
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the word at the memory location specified by the effective address are loaded to general-purpose register rt. In the 64-bit mode, the loaded word is sign-extended.
An address error exception occurs if the lower 2 bits of the address are not 0.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          GPR[rt] ← mem
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          GPR[rt] ← mem

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
LWCz    Load Word to Coprocessor z

Encoding:
  Bits    31-26          25-21   20-16   15-0
  Field   LWCz           base    rt      offset
  Value   1100xx (Note)

Format:   LWCz rt, offset (base)    [MIPS I]

Purpose:  Loads a word from memory to the coprocessor general-purpose register.

Description:
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the word at the memory location specified by the effective address are loaded to CPz register rt. How the data is used is defined for each processor.
An address error exception occurs if the lower 2 bits of the address are not 0.
This instruction is invalid for CP0.

Operation:
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
          COPzLW (byte, rt, mem)
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor (ReverseEndian || 0^2))
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          byte ← vAddr_2..0 xor (BigEndianCPU || 0^2)
          COPzLW (byte, rt, mem)

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Coprocessor unusable exception

Note  See the opcode table below, or 17.4 CPU Instruction Opcode Bit Encoding.
Opcode table (LWCz):

  Bit     31  30  29  28    27  26    25 ... 0
  LWC1    1   1   0   0     0   1
  LWC2    1   1   0   0     1   0
          Opcode            Coprocessor no.

Remark  Coprocessor 2 is reserved in the VR5500.
LWL    Load Word Left

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LWL      base    rt      offset
  Value   100010

Format:   LWL rt, offset (base)    [MIPS I]

Purpose:  Loads the most significant part of a word from unaligned memory.

Description:
This instruction can be used in combination with the LWR instruction to load word data that is not aligned at a word boundary in memory to general-purpose register rt. The LWL instruction loads the higher-order portion of the data, and the LWR instruction loads the lower-order portion of the data to the register.
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address that can specify an arbitrary byte. Of the word data in memory whose most significant byte is the byte specified by the virtual address, only the data within the same word boundary as the target address is loaded and stored in the higher portion of general-purpose register rt. Other bits in general-purpose register rt are not changed. The number of bytes loaded varies from one to four depending on the byte specified.
In other words, the byte specified by the virtual address is stored in the most significant byte of general-purpose register rt. As long as lower bytes remain within the same word boundary, each is stored in the next byte of general-purpose register rt. The remaining lower bytes of the register are not changed.

Example: LWL $24, 4 ($0), little-endian memory

  Memory (byte addresses):
    Address 4:   7   6   5   4
    Address 0:   3   2   1   0

  Register $24:
    Before load:   A   B   C   D
    After load:    4   B   C   D
The contents of general-purpose register rt are internally bypassed within the processor, so no NOP is needed between an immediately preceding load instruction that specifies register rt and a following LWL (or LWR) instruction that also specifies register rt.
An address error exception caused by the specified address not being aligned at a word boundary does not occur.

Operation (LWL):
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_PSIZE-1..2 || 0^2
          endif
          byte ← vAddr_1..0 xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
          temp ← mem_32*word+8*byte+7..32*word || GPR[rt]_23-8*byte..0
          GPR[rt] ← temp
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
          if BigEndianMem = 0 then
              pAddr ← pAddr_PSIZE-1..2 || 0^2
          endif
          byte ← vAddr_1..0 xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory (uncached, byte, pAddr, vAddr, DATA)
          temp ← mem_32*word+8*byte+7..32*word || GPR[rt]_23-8*byte..0
          GPR[rt] ← (temp_31)^32 || temp
The relationship between the address assigned to the LWL instruction and its result (each byte of the register) is shown below.

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                   BigEndianCPU = 1
  vAddr_2..0   Destination  Type  Offset          Destination  Type  Offset
                                  LEM  BEM                           LEM  BEM
  0            SSSSPFGH     0     0    7          SSSSIJKL     3     4    0
  1            SSSSOPGH     1     0    6          SSSSJKLH     2     4    1
  2            SSSSNOPH     2     0    5          SSSSKLGH     1     4    2
  3            SSSSMNOP     3     0    4          SSSSLFGH     0     4    3
  4            SSSSLFGH     0     4    3          SSSSMNOP     3     0    4
  5            SSSSKLGH     1     4    2          SSSSNOPH     2     0    5
  6            SSSSJKLH     2     4    1          SSSSOPGH     1     0    6
  7            SSSSIJKL     3     4    0          SSSSPFGH     0     0    7

Remark  Type:    AccessType (see Figure 3-3 Byte Specification Related to Load and Store Instruction) output to memory
        Offset:  pAddr_2..0 output to memory
        LEM:     Little-endian memory (BigEndianMem = 0)
        BEM:     Big-endian memory (BigEndianMem = 1)
        S:       Bit 31 of destination sign-extended

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
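A compact Python model of the 64-bit-mode LWL merge for BigEndianCPU = 0 is given below as a sketch. It takes the aligned doubleword directly as an integer and ignores address translation; lwl_le64 and sign_extend32 are hypothetical names. The S positions in the table correspond to the sign extension applied at the end.

```python
def sign_extend32(value):
    """Replicate bit 31 into bits 63..32."""
    value &= 0xFFFFFFFF
    return value | 0xFFFFFFFF00000000 if value & 0x80000000 else value

def lwl_le64(reg, mem_dword, vaddr):
    """LWL in the 64-bit mode on a little-endian CPU."""
    byte = vaddr & 3
    word = (vaddr >> 2) & 1
    nbits = 8 * (byte + 1)
    # mem[32*word+8*byte+7 .. 32*word]
    loaded = (mem_dword >> (32 * word)) & ((1 << nbits) - 1)
    # GPR[rt][23-8*byte .. 0]
    kept = reg & ((1 << (32 - nbits)) - 1)
    temp = (loaded << (32 - nbits)) | kept
    return sign_extend32(temp)

REGISTER = int.from_bytes(b"ABCDEFGH", "big")
MEMORY = int.from_bytes(b"IJKLMNOP", "big")
for vaddr in range(8):
    low_word = lwl_le64(REGISTER, MEMORY, vaddr) & 0xFFFFFFFF
    print(vaddr, low_word.to_bytes(4, "big").decode("ascii"))
```

The printed letters are the lower four bytes of each Destination entry; because the letter codes all have bit 7 clear, the S bytes are zero in this particular run.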
LWR    Load Word Right

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LWR      base    rt      offset
  Value   100110

Format:   LWR rt, offset (base)    [MIPS I]

Purpose:  Loads the least significant part of a word from unaligned memory.

Description:
This instruction can be used in combination with the LWL instruction to load word data that is not aligned at a word boundary in memory to general-purpose register rt. The LWL instruction loads the higher-order portion of the data, and the LWR instruction loads the lower-order portion of the data to the register.
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address that can specify an arbitrary byte. Of the word data in memory whose least significant byte is the byte specified by the virtual address, only the data within the same word boundary as the target address is loaded and stored in the lower portion of general-purpose register rt. Other bits in general-purpose register rt are not changed. The number of bytes loaded varies from one to four depending on the byte specified.
In other words, the byte specified by the virtual address is stored in the least significant byte of general-purpose register rt. As long as higher bytes remain within the same word boundary, each is stored in the next byte of general-purpose register rt.

Example: LWR $24, 1 ($0), little-endian memory

  Memory (byte addresses):
    Address 4:   7   6   5   4
    Address 0:   3   2   1   0

  Register $24:
    Before load:   A   B   C   D
    After load:    A   3   2   1
The contents of general-purpose register rt are internally bypassed within the processor, so no NOP is needed between an immediately preceding load instruction that specifies register rt and a following LWR (or LWL) instruction that also specifies register rt.
An address error exception caused by the specified address not being aligned at a word boundary does not occur.

Operation (LWR):
  32  T:  vAddr ← ((offset_15)^16 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
          if BigEndianMem = 1 then
              pAddr ← pAddr_PSIZE-1..3 || 0^3
          endif
          byte ← vAddr_1..0 xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
          temp ← GPR[rt]_31..32-8*byte || mem_31+32*word..32*word+8*byte
          GPR[rt] ← temp
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          pAddr ← pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
          if BigEndianMem = 1 then
              pAddr ← pAddr_PSIZE-1..3 || 0^3
          endif
          byte ← vAddr_1..0 xor BigEndianCPU^2
          word ← vAddr_2 xor BigEndianCPU
          mem ← LoadMemory (uncached, 0 || byte, pAddr, vAddr, DATA)
          temp ← GPR[rt]_31..32-8*byte || mem_31+32*word..32*word+8*byte
          GPR[rt] ← (temp_31)^32 || temp
The relationship between the address assigned to the LWR instruction and its result (each byte of the register) is shown below.

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                   BigEndianCPU = 1
  vAddr_2..0   Destination  Type  Offset          Destination  Type  Offset
                                  LEM  BEM                           LEM  BEM
  0            SSSSMNOP     3     0    4          XXXXEFGI     0     7    0
  1            XXXXEMNO     2     1    4          XXXXEFIJ     1     6    0
  2            XXXXEFMN     1     2    4          XXXXEIJK     2     5    0
  3            XXXXEFGM     0     3    4          SSSSIJKL     3     4    0
  4            SSSSIJKL     3     4    0          XXXXEFGM     0     3    4
  5            XXXXEIJK     2     5    0          XXXXEFMN     1     2    4
  6            XXXXEFIJ     1     6    0          XXXXEMNO     2     1    4
  7            XXXXEFGI     0     7    0          SSSSMNOP     3     0    4

Remark  Type:    AccessType (see Figure 3-3 Byte Specification Related to Load and Store Instruction) output to memory
        Offset:  pAddr_2..0 output to memory
        LEM:     Little-endian memory (BigEndianMem = 0)
        BEM:     Big-endian memory (BigEndianMem = 1)
        S:       Bit 31 of destination sign-extended
        X:       No change (32-bit mode)
                 Bit 31 of destination sign-extended (64-bit mode)

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
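The LWR merge in the 64-bit mode for BigEndianCPU = 0 can be modeled in Python as sketched below. As with the other examples, this assumes the aligned doubleword is already available as an integer; lwr_le64 and sign_extend32 are hypothetical names. In the 64-bit mode both the S and the X positions of the table receive copies of bit 31 of the merged word.

```python
def sign_extend32(value):
    """Replicate bit 31 into bits 63..32."""
    value &= 0xFFFFFFFF
    return value | 0xFFFFFFFF00000000 if value & 0x80000000 else value

def lwr_le64(reg, mem_dword, vaddr):
    """LWR in the 64-bit mode on a little-endian CPU."""
    byte = vaddr & 3
    word = (vaddr >> 2) & 1
    shift = 8 * byte
    mem_word = (mem_dword >> (32 * word)) & 0xFFFFFFFF
    # GPR[rt][31 .. 32-8*byte]: the upper `byte` bytes of the low word are kept
    kept = (reg & 0xFFFFFFFF) & ~((1 << (32 - shift)) - 1)
    # mem[31+32*word .. 32*word+8*byte]
    temp = kept | (mem_word >> shift)
    return sign_extend32(temp)

REGISTER = int.from_bytes(b"ABCDEFGH", "big")
MEMORY = int.from_bytes(b"IJKLMNOP", "big")
for vaddr in range(8):
    low_word = lwr_le64(REGISTER, MEMORY, vaddr) & 0xFFFFFFFF
    print(vaddr, low_word.to_bytes(4, "big").decode("ascii"))
```

The printed letters match the lower four bytes of the Destination column for BigEndianCPU = 0.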
LWU    Load Word Unsigned

Encoding:
  Bits    31-26    25-21   20-16   15-0
  Field   LWU      base    rt      offset
  Value   100111

Format:   LWU rt, offset (base)    [MIPS III]

Purpose:  Loads a word from memory as an unsigned value.

Description:
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The contents of the word at the memory location specified by the effective address are loaded to general-purpose register rt. The loaded word is zero-extended.
An address error exception occurs if the lower 2 bits of the address are not 0.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.

Operation:
  64  T:  vAddr ← ((offset_15)^48 || offset_15..0) + GPR[base]
          (pAddr, uncached) ← AddressTranslation (vAddr, DATA)
          mem ← LoadMemory (uncached, WORD, pAddr, vAddr, DATA)
          GPR[rt] ← 0^32 || mem

Remark  The higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.

Exceptions:
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Reserved instruction exception (32-bit user/supervisor mode)
MACC    Multiply, Accumulate, and Move LO

Encoding:
  Bits    31-26     25-21   20-16   15-11   10-0
  Field   SPECIAL   rs      rt      rd      MACC
  Value   000000                            00101011000

Format:   MACC rd, rs, rt    [VR5500]

Purpose:  Combines multiplication and addition of 32-bit signed integers for execution.

Description:
This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The contents of this accumulator are added to the result of the multiplication as a 64-bit signed integer, and the result is stored in the accumulator. The lower 32 bits of the result are also stored in general-purpose register rd.
An integer overflow exception does not occur.

Operation:
  32, 64  T:  HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt])
              GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt]))_31..0

Exceptions:
  None
MACCHI    Multiply, Accumulate, and Move HI

Encoding:
  Bits    31-26     25-21   20-16   15-11   10-0
  Field   SPECIAL   rs      rt      rd      MACCHI
  Value   000000                            01101011000

Format:   MACCHI rd, rs, rt    [VR5500]

Purpose:  Combines multiplication and addition of 32-bit signed integers for execution.

Description:
This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The contents of this accumulator are added to the result of the multiplication as a 64-bit signed integer, and the result is stored in the accumulator. The higher 32 bits of the result are also stored in general-purpose register rd.
An integer overflow exception does not occur.

Operation:
  32, 64  T:  HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt])
              GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt]))_63..32

Exceptions:
  None
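The signed multiply-accumulate behavior of MACC and MACCHI can be sketched in Python as follows. The model covers only the lower 32 bits of HI, LO, and rd that appear in the Operation lines; what happens to the upper register bits is not modeled. The function names and the (HI, LO, rd) return convention are assumptions made for this illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def _accumulate_signed(hi, lo, rs, rt):
    acc = to_signed(((hi & MASK32) << 32) | (lo & MASK32), 64)
    product = to_signed(rs, 32) * to_signed(rt, 32)
    result = (acc + product) & MASK64      # wraps silently: no overflow exception
    return result >> 32, result & MASK32

def macc(hi, lo, rs, rt):
    """Return (HI[31..0], LO[31..0], rd[31..0]) after MACC."""
    new_hi, new_lo = _accumulate_signed(hi, lo, rs, rt)
    return new_hi, new_lo, new_lo          # rd receives the lower 32 bits

def macchi(hi, lo, rs, rt):
    """Return (HI[31..0], LO[31..0], rd[31..0]) after MACCHI."""
    new_hi, new_lo = _accumulate_signed(hi, lo, rs, rt)
    return new_hi, new_lo, new_hi          # rd receives the higher 32 bits

# Accumulator 5 plus (-2 * 3) gives -1, i.e. all 64 accumulator bits set.
print([hex(v) for v in macc(0, 5, 0xFFFFFFFE, 3)])
```

The only difference between the two functions is which half of the 64-bit result is copied to rd.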
MACCHIU  Unsigned Multiply, Accumulate, and Move HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MACCHIU 01101011001 (10..0)

Format: MACCHIU rd, rs, rt    VR5500

Purpose: Combines multiplication and addition of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The contents of this accumulator are added to the result of the multiplication as a 64-bit unsigned integer, and the result is stored in the accumulator. The higher 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt]))_63..32

Exceptions: None
MACCU  Unsigned Multiply, Accumulate, and Move LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MACCU 00101011001 (10..0)

Format: MACCU rd, rs, rt    VR5500

Purpose: Combines multiplication and addition of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The contents of this accumulator are added to the result of the multiplication as a 64-bit unsigned integer, and the result is stored in the accumulator. The lower 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) + (GPR[rs] * GPR[rt]))_31..0

Exceptions: None
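The four multiply-accumulate variants (MACC, MACCHI, MACCU, MACCHIU) differ only in operand signedness and in which accumulator half is copied to rd. The following Python model is a minimal sketch of the Operation sections above, not a description of the hardware. The function names mac and as_signed32 are illustrative, and the model assumes that only the lower 32 bits of HI, LO, rs, and rt take part in the operation.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def as_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def mac(hi, lo, rs, rt, signed=True, move_hi=False):
    """Model MACC / MACCHI / MACCU / MACCHIU on 32-bit register fields.

    Returns (new_hi, new_lo, rd), each as an unsigned 32-bit value.
    """
    a = as_signed32(rs) if signed else rs & MASK32
    b = as_signed32(rt) if signed else rt & MASK32
    # Accumulator = HI_31..0 || LO_31..0
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    # The sum wraps modulo 2**64; no overflow exception is modeled.
    acc = (acc + a * b) & MASK64
    new_hi, new_lo = acc >> 32, acc & MASK32
    return new_hi, new_lo, (new_hi if move_hi else new_lo)
```

With signed=True and move_hi=False the function behaves like MACC; setting move_hi=True gives MACCHI, and signed=False selects the unsigned forms.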
MADD  Multiply and Add Word

Encoding: SPECIAL2 011100 (bits 31..26) | rs (25..21) | rt (20..16) | 0 0000000000 (15..6) | MADD 000000 (5..0)

Format: MADD rs, rt    VR5500

Purpose: Combines multiplication and addition of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as signed integers. The result of this multiplication is added to the 64-bit value formed by combining the lower 32 bits of special registers HI and LO. The lower word of the 64-bit sum is sign-extended and loaded to special register LO, and the higher word is sign-extended and loaded to special register HI. An integer overflow exception does not occur.

Operation:
32, 64  T: temp1 ← GPR[rs] * GPR[rt]
           temp2 ← temp1 + (HI_31..0 || LO_31..0)
           LO ← (temp2_31)^32 || temp2_31..0
           HI ← (temp2_63)^32 || temp2_63..32

Exceptions: None
MADDU  Multiply and Add Word Unsigned

Encoding: SPECIAL2 011100 (bits 31..26) | rs (25..21) | rt (20..16) | 0 0000000000 (15..6) | MADDU 000001 (5..0)

Format: MADDU rs, rt    VR5500

Purpose: Combines multiplication and addition of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as unsigned integers. The result of this multiplication is added to the 64-bit value formed by combining the lower 32 bits of special registers HI and LO. The lower word of the 64-bit sum is sign-extended and loaded to special register LO, and the higher word is sign-extended and loaded to special register HI. An integer overflow exception does not occur.

Operation:
32, 64  T: temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
           temp2 ← temp1 + (HI_31..0 || LO_31..0)
           LO ← (temp2_31)^32 || temp2_31..0
           HI ← (temp2_63)^32 || temp2_63..32

Exceptions: None
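Unlike the MACC family, MADD and MADDU write each half of the sum back to HI and LO in sign-extended form. The sketch below models that write-back for 64-bit HI and LO register images. It is an illustration under stated assumptions: the names madd and sign_extend32 are hypothetical, and the operands are taken from the lower 32 bits of rs and rt.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend32(word):
    """Return the 64-bit register image of a sign-extended 32-bit word."""
    word &= MASK32
    return (word | 0xFFFFFFFF00000000) if word & 0x80000000 else word


def madd(hi, lo, rs, rt, unsigned=False):
    """Model MADD (or MADDU when unsigned=True); returns (new_hi, new_lo)."""
    a, b = rs & MASK32, rt & MASK32
    if not unsigned:
        # Reinterpret the 32-bit operands as two's-complement values.
        a = a - (1 << 32) if a & 0x80000000 else a
        b = b - (1 << 32) if b & 0x80000000 else b
    temp1 = a * b
    temp2 = (temp1 + (((hi & MASK32) << 32) | (lo & MASK32))) & MASK64
    new_lo = sign_extend32(temp2 & MASK32)   # (temp2_31)^32 || temp2_31..0
    new_hi = sign_extend32(temp2 >> 32)      # (temp2_63)^32 || temp2_63..32
    return new_hi, new_lo
```

Note that LO is sign-extended from bit 31 of the sum even for MADDU, so a low word of 0x80000000 or above appears as a negative 64-bit register image.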
MFC0  Move from System Control Coprocessor

Encoding: COP0 010000 (bits 31..26) | MF 00000 (25..21) | rt (20..16) | rd (15..11) | 0 00000000000 (10..0)

Format: MFC0 rt, rd    MIPS I

Description: The contents of coprocessor register rd of CP0 are loaded to general-purpose register rt.

Operation:
32  T:     data ← CPR[0, rd]
    T + 1: GPR[rt] ← data
64  T:     data ← CPR[0, rd]
    T + 1: GPR[rt] ← (data_31)^32 || data_31..0

Exceptions: Coprocessor unusable exception (64/32-bit user/supervisor mode if CP0 is disabled)
MFCz  Move from Coprocessor z

Encoding: COPz 0100xx (Note) (bits 31..26) | MF 00000 (25..21) | rt (20..16) | rd (15..11) | 0 00000000000 (10..0)

Format: MFCz rt, rd    MIPS I

Description: The contents of coprocessor register rd of CPz are loaded to general-purpose register rt.

Operation:
32  T:     data ← CPR[z, rd]
    T + 1: GPR[rt] ← data
64  T:     if rd_0 = 0 then
               data ← CPR[z, rd_4..1 || 0]_31..0
           else
               data ← CPR[z, rd_4..1 || 0]_63..32
           endif
    T + 1: GPR[rt] ← (data_31)^32 || data

Exceptions: Coprocessor unusable exception

Note: See the opcode table below, or 17.4 CPU Instruction Opcode Bit Encoding.

Opcode table:
Instruction  Opcode (31..28)  Coprocessor no. (27..26)  Coprocessor sub-opcode (25..21)
MFC0         0100             00                        00000
MFC1         0100             01                        00000
MFC2         0100             10                        00000

Remark: Coprocessor 2 is reserved in the VR5500.
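In 64-bit operation, MFCz uses bit 0 of rd to choose which half of an even-numbered 64-bit coprocessor register is read. The following sketch illustrates that selection. The dictionary used as the register file and the name mfcz_64 are assumptions made for the example only.

```python
MASK32 = 0xFFFFFFFF


def mfcz_64(cpr, rd):
    """Model the 64-bit MFCz operation.

    cpr maps even coprocessor register numbers to 64-bit values.
    Returns the 64-bit value written to GPR[rt].
    """
    pair = cpr[rd & ~1]                  # CPR[z, rd_4..1 || 0]
    if (rd & 1) == 0:
        data = pair & MASK32             # bits 31..0
    else:
        data = (pair >> 32) & MASK32     # bits 63..32
    # (data_31)^32 || data: sign-extend the selected word.
    return (data | 0xFFFFFFFF00000000) if data & 0x80000000 else data
```

For example, with register 2 holding 0x8000000100000002, reading rd = 2 returns the low word and reading rd = 3 returns the sign-extended high word.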
MFHI  Move from HI

Encoding: SPECIAL 000000 (bits 31..26) | 0 0000000000 (25..16) | rd (15..11) | 0 00000 (10..6) | MFHI 010000 (5..0)

Format: MFHI rd    MIPS I

Description: The contents of special register HI are loaded to general-purpose register rd.

Operation:
32, 64  T: GPR[rd] ← HI

Exceptions: None
MFLO  Move from LO

Encoding: SPECIAL 000000 (bits 31..26) | 0 0000000000 (25..16) | rd (15..11) | 0 00000 (10..6) | MFLO 010010 (5..0)

Format: MFLO rd    MIPS I

Description: The contents of special register LO are loaded to general-purpose register rd.

Operation:
32, 64  T: GPR[rd] ← LO

Exceptions: None
MFPC  Move from Performance Counter

Encoding: COP0 010000 (bits 31..26) | MF 00000 (25..21) | rt (20..16) | CP0 25 11001 (15..11) | 0 00000 (10..6) | reg (5..1) | 1 (0)

Format: MFPC rt, reg    VR5500

Description: This instruction loads the contents of performance counter reg of CP0 to general-purpose register rt. With the VR5500, only 0 and 1 are valid as reg.

Operation:
32  T:     data ← CPR[0, reg]
    T + 1: GPR[rt] ← data
64  T:     data ← CPR[0, reg]
    T + 1: GPR[rt] ← (data_31)^32 || data_31..0

Exceptions: Coprocessor unusable exception
MFPS  Move from Performance Event Specifier

Encoding: COP0 010000 (bits 31..26) | MF 00000 (25..21) | rt (20..16) | CP0 25 11001 (15..11) | 0 00000 (10..6) | reg (5..1) | 0 (0)

Format: MFPS rt, reg    VR5500

Description: This instruction loads the contents of performance event specifier reg of CP0 to general-purpose register rt. With the VR5500, only 0 and 1 are valid as reg.

Operation:
32  T:     data ← CPR[0, reg]
    T + 1: GPR[rt] ← data
64  T:     data ← CPR[0, reg]
    T + 1: GPR[rt] ← (data_31)^32 || data_31..0

Exceptions: Coprocessor unusable exception
MOVN  Move Conditional on Not Zero

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 00000 (10..6) | MOVN 001011 (5..0)

Format: MOVN rd, rs, rt    MIPS IV

Purpose: Tests the value of a general-purpose register and then conditionally moves the contents of a general-purpose register.

Description: If the contents of general-purpose register rt are not 0, this instruction moves the contents of general-purpose register rs to general-purpose register rd.

Operation:
32, 64  T: if GPR[rt] ≠ 0 then
               GPR[rd] ← GPR[rs]
           endif

Exceptions: Reserved instruction exception

Remark: The value tested by this instruction is the result of a comparison by the SLT, SLTI, SLTU, or SLTIU instruction; the move takes place when the comparison condition was established as true.
MOVZ  Move Conditional on Zero

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 00000 (10..6) | MOVZ 001010 (5..0)

Format: MOVZ rd, rs, rt    MIPS IV

Purpose: Tests the value of a general-purpose register and then conditionally moves the contents of a general-purpose register.

Description: If the contents of general-purpose register rt are 0, this instruction moves the contents of general-purpose register rs to general-purpose register rd.

Operation:
32, 64  T: if GPR[rt] = 0 then
               GPR[rd] ← GPR[rs]
           endif

Exceptions: Reserved instruction exception

Remark: The value tested by this instruction is the result of a comparison by the SLT, SLTI, SLTU, or SLTIU instruction; the move takes place when the comparison condition was established as false.
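The remarks on MOVN and MOVZ describe pairing a conditional move with an SLT-type comparison. The sketch below shows how such a pair can select the smaller of two signed 32-bit values without a branch. The helper names (slt, movn, movz, signed_min) are illustrative Python stand-ins for the instructions, not part of any toolchain.

```python
def as_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def slt(rs, rt):
    """SLT: 1 if rs < rt as signed integers, else 0."""
    return 1 if as_signed32(rs) < as_signed32(rt) else 0


def movn(rd, rs, rt):
    """MOVN: the new value of rd; rs is moved only when rt is not 0."""
    return rs if rt != 0 else rd


def movz(rd, rs, rt):
    """MOVZ: the new value of rd; rs is moved only when rt is 0."""
    return rs if rt == 0 else rd


def signed_min(a, b):
    """Branch-free minimum, mirroring: slt t0,a,b / move v0,b / movn v0,a,t0."""
    flag = slt(a, b)
    result = b
    result = movn(result, a, flag)
    return result
```

Replacing MOVN with MOVZ in the same sequence would select the larger value instead, since the move then happens when the comparison is false.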
MSAC  Multiply, Negate, Accumulate, and Move LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MSAC 00111011000 (10..0)

Format: MSAC rd, rs, rt    VR5500

Purpose: Combines multiplication and subtraction of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is subtracted from the contents of the accumulator, and the result of this subtraction is stored in the accumulator. The contents of the accumulator are treated as a 64-bit signed integer. The lower 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt]))_31..0

Exceptions: None
MSACHI  Multiply, Negate, Accumulate, and Move HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MSACHI 01111011000 (10..0)

Format: MSACHI rd, rs, rt    VR5500

Purpose: Combines multiplication and subtraction of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is subtracted from the contents of the accumulator, and the result of this subtraction is stored in the accumulator. The contents of the accumulator are treated as a 64-bit signed integer. The higher 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt]))_63..32

Exceptions: None
MSACHIU  Unsigned Multiply, Negate, Accumulate, and Move HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MSACHIU 01111011001 (10..0)

Format: MSACHIU rd, rs, rt    VR5500

Purpose: Combines multiplication and subtraction of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is subtracted from the contents of the accumulator, and the result of this subtraction is stored in the accumulator. The contents of the accumulator are treated as a 64-bit unsigned integer. The higher 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt]))_63..32

Exceptions: None
MSACU  Unsigned Multiply, Negate, Accumulate, and Move LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MSACU 00111011001 (10..0)

Format: MSACU rd, rs, rt    VR5500

Purpose: Combines multiplication and subtraction of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is subtracted from the contents of the accumulator, and the result of this subtraction is stored in the accumulator. The contents of the accumulator are treated as a 64-bit unsigned integer. The lower 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← (HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← ((HI_31..0 || LO_31..0) - (GPR[rs] * GPR[rt]))_31..0

Exceptions: None
MSUB  Multiply and Subtract Word

Encoding: SPECIAL2 011100 (bits 31..26) | rs (25..21) | rt (20..16) | 0 0000000000 (15..6) | MSUB 000100 (5..0)

Format: MSUB rs, rt    VR5500

Purpose: Combines multiplication and subtraction of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as signed integers. The result of this multiplication is subtracted from the 64-bit value formed by combining the lower 32 bits of special registers HI and LO. The lower word of the 64-bit difference is sign-extended and loaded to special register LO, and the higher word is sign-extended and loaded to special register HI. An integer overflow exception does not occur.

Operation:
32, 64  T: temp1 ← GPR[rs] * GPR[rt]
           temp2 ← (HI_31..0 || LO_31..0) - temp1
           LO ← (temp2_31)^32 || temp2_31..0
           HI ← (temp2_63)^32 || temp2_63..32

Exceptions: None
MSUBU  Multiply and Subtract Word Unsigned

Encoding: SPECIAL2 011100 (bits 31..26) | rs (25..21) | rt (20..16) | 0 0000000000 (15..6) | MSUBU 000101 (5..0)

Format: MSUBU rs, rt    VR5500

Purpose: Combines multiplication and subtraction of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as unsigned integers. The result of this multiplication is subtracted from the 64-bit value formed by combining the lower 32 bits of special registers HI and LO. The lower word of the 64-bit difference is sign-extended and loaded to special register LO, and the higher word is sign-extended and loaded to special register HI. An integer overflow exception does not occur.

Operation:
32, 64  T: temp1 ← (0^32 || GPR[rs]) * (0^32 || GPR[rt])
           temp2 ← (HI_31..0 || LO_31..0) - temp1
           LO ← (temp2_31)^32 || temp2_31..0
           HI ← (temp2_63)^32 || temp2_63..32

Exceptions: None
MTC0  Move to System Control Coprocessor

Encoding: COP0 010000 (bits 31..26) | MT 00100 (25..21) | rt (20..16) | rd (15..11) | 0 00000000000 (10..0)

Format: MTC0 rt, rd    MIPS I

Description: The contents of general-purpose register rt are loaded to coprocessor register rd of coprocessor 0. Because this instruction may alter the state of the virtual address translation system, the operation of load instructions, store instructions, and TLB operations immediately before and after this instruction is undefined. When instructions before or after MTC0 use the register that it writes, refer to Chapter 19 Instruction Hazards and place the instructions in the appropriate location.

Operation:
32, 64  T:     data ← GPR[rt]
        T + 1: CPR[0, rd] ← data

Exceptions: Coprocessor unusable exception (64/32-bit user/supervisor mode if CP0 is disabled)
MTCz  Move to Coprocessor z

Encoding: COPz 0100xx (Note) (bits 31..26) | MT 00100 (25..21) | rt (20..16) | rd (15..11) | 0 00000000000 (10..0)

Format: MTCz rt, rd    MIPS I

Description: The contents of general-purpose register rt are loaded to coprocessor register rd of CPz.

Operation:
32  T:     data ← GPR[rt]
    T + 1: CPR[z, rd] ← data
64  T:     data ← GPR[rt]
    T + 1: if rd_0 = 0 then
               CPR[z, rd_4..1 || 0] ← CPR[z, rd_4..1 || 0]_63..32 || data
           else
               CPR[z, rd_4..1 || 0] ← data || CPR[z, rd_4..1 || 0]_31..0
           endif

Exceptions: Coprocessor unusable exception

Note: See the opcode table below, or 17.4 CPU Instruction Opcode Bit Encoding.

Opcode table:
Instruction  Opcode (31..28)  Coprocessor no. (27..26)  Coprocessor sub-opcode (25..21)
MTC0         0100             00                        00100
MTC1         0100             01                        00100
MTC2         0100             10                        00100

Remark: Coprocessor 2 is reserved in the VR5500.
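In 64-bit operation, MTCz replaces one 32-bit half of an even-numbered coprocessor register and preserves the other half. The sketch below models that merge. The name mtcz_64 and the dictionary register file are hypothetical, and the model assumes that data means the lower 32 bits of GPR[rt].

```python
MASK32 = 0xFFFFFFFF


def mtcz_64(cpr, rd, gpr_rt):
    """Model the 64-bit MTCz operation; updates and returns cpr.

    cpr maps even coprocessor register numbers to 64-bit values.
    """
    even = rd & ~1                       # rd_4..1 || 0
    data = gpr_rt & MASK32               # assumption: low word of GPR[rt]
    old = cpr.get(even, 0)
    if (rd & 1) == 0:
        # Keep bits 63..32, replace bits 31..0.
        cpr[even] = (old & 0xFFFFFFFF00000000) | data
    else:
        # Replace bits 63..32, keep bits 31..0.
        cpr[even] = (data << 32) | (old & MASK32)
    return cpr
```

Writing first to rd = 4 and then to rd = 5 therefore fills the low and high halves of the same 64-bit register in turn.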
MTHI  Move to HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | 0 000000000000000 (20..6) | MTHI 010001 (5..0)

Format: MTHI rs    MIPS I

Description: The contents of general-purpose register rs are loaded to special register HI. If an MTHI operation is executed following a MULT, MULTU, DIV, or DIVU instruction, but before any MFLO, MFHI, MTLO, or MTHI instruction, the contents of special register LO are undefined.

Operation:
32, 64  T - 2: HI ← undefined
        T - 1: HI ← undefined
        T:     HI ← GPR[rs]

Exceptions: None
MTLO  Move to LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | 0 000000000000000 (20..6) | MTLO 010011 (5..0)

Format: MTLO rs    MIPS I

Description: The contents of general-purpose register rs are loaded to special register LO. If an MTLO operation is executed following a MULT, MULTU, DIV, or DIVU instruction, but before any MFLO, MFHI, MTLO, or MTHI instruction, the contents of special register HI are undefined.

Operation:
32, 64  T - 2: LO ← undefined
        T - 1: LO ← undefined
        T:     LO ← GPR[rs]

Exceptions: None
MTPC  Move to Performance Counter

Encoding: COP0 010000 (bits 31..26) | MT 00100 (25..21) | rt (20..16) | CP0 25 11001 (15..11) | 0 00000 (10..6) | reg (5..1) | 1 (0)

Format: MTPC rt, reg    VR5500

Description: This instruction loads the contents of general-purpose register rt to performance counter reg of CP0. With the VR5500, only 0 and 1 are valid as reg.

Operation:
32, 64  T:     data ← GPR[rt]
        T + 1: CPR[0, reg] ← data

Exceptions: Coprocessor unusable exception
MTPS  Move to Performance Event Specifier

Encoding: COP0 010000 (bits 31..26) | MT 00100 (25..21) | rt (20..16) | CP0 25 11001 (15..11) | 0 00000 (10..6) | reg (5..1) | 0 (0)

Format: MTPS rt, reg    VR5500

Description: This instruction loads the contents of general-purpose register rt to performance event specifier reg of CP0. With the VR5500, only 0 and 1 are valid as reg.

Operation:
32, 64  T:     data ← GPR[rt]
        T + 1: CPR[0, reg] ← data

Exceptions: Coprocessor unusable exception
MUL  Multiply and Move LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MUL 00001011000 (10..0)

Format: MUL rd, rs, rt    VR5500

Purpose: Combines multiplication and transfer of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is stored in the accumulator. The lower 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← GPR[rs] * GPR[rt]
           GPR[rd]_31..0 ← (GPR[rs] * GPR[rt])_31..0

Exceptions: None
MUL64  Multiply and Move

Encoding: SPECIAL2 011100 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 00000 (10..6) | MUL64 000010 (5..0)

Format: MUL64 rd, rs, rt    VR5500

Purpose: Combines multiplication and transfer of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit signed integers. The lower 32 bits of the result are stored in general-purpose register rd. An integer overflow exception does not occur. The contents of special registers HI and LO are undefined after execution of this instruction.

Operation:
32, 64  T: GPR[rd]_31..0 ← (GPR[rs] * GPR[rt])_31..0
           HI ← undefined
           LO ← undefined

Exceptions: None
MULHI  Multiply and Move HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MULHI 01001011000 (10..0)

Format: MULHI rd, rs, rt    VR5500

Purpose: Combines multiplication and transfer of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is stored in the accumulator. The higher 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← GPR[rs] * GPR[rt]
           GPR[rd]_31..0 ← (GPR[rs] * GPR[rt])_63..32

Exceptions: None
MULHIU  Unsigned Multiply and Move HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MULHIU 01001011001 (10..0)

Format: MULHIU rd, rs, rt    VR5500

Purpose: Combines multiplication and transfer of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is stored in the accumulator. The higher 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← GPR[rs] * GPR[rt]
           GPR[rd]_31..0 ← (GPR[rs] * GPR[rt])_63..32

Exceptions: None
MULS  Multiply, Negate, and Move LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MULS 00011011000 (10..0)

Format: MULS rd, rs, rt    VR5500

Purpose: Combines multiplication and negation of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt and negates the result. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator, and the negated result is stored in the accumulator. The lower 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← 0 - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← (0 - (GPR[rs] * GPR[rt]))_31..0

Exceptions: None
MULSHI  Multiply, Negate, and Move HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MULSHI 01011011000 (10..0)

Format: MULSHI rd, rs, rt    VR5500

Purpose: Combines multiplication and negation of 32-bit signed integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt and negates the result. It treats both operands as 32-bit signed integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator, and the negated result is stored in the accumulator. The higher 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← 0 - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← (0 - (GPR[rs] * GPR[rt]))_63..32

Exceptions: None
MULSHIU  Unsigned Multiply, Negate, and Move HI

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MULSHIU 01011011001 (10..0)

Format: MULSHIU rd, rs, rt    VR5500

Purpose: Combines multiplication and negation of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt and negates the result. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator, and the negated result is stored in the accumulator. The higher 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← 0 - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← (0 - (GPR[rs] * GPR[rt]))_63..32

Exceptions: None
MULSU  Unsigned Multiply, Negate, and Move LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MULSU 00011011001 (10..0)

Format: MULSU rd, rs, rt    VR5500

Purpose: Combines multiplication and negation of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt and negates the result. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator, and the negated result is stored in the accumulator. The lower 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← 0 - (GPR[rs] * GPR[rt])
           GPR[rd]_31..0 ← (0 - (GPR[rs] * GPR[rt]))_31..0

Exceptions: None
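For the unsigned forms MULSU and MULSHIU, the expression 0 - (GPR[rs] * GPR[rt]) cannot be negative in a 64-bit accumulator, so the result is the two's complement of the product modulo 2^64. The sketch below illustrates this. The name muls is hypothetical, and the model assumes that only the lower 32 bits of rs and rt are used.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def muls(rs, rt, signed=True, move_hi=False):
    """Model MULS / MULSHI / MULSU / MULSHIU.

    Returns (new_hi, new_lo, rd) as unsigned 32-bit values.
    """
    a, b = rs & MASK32, rt & MASK32
    if signed:
        a = a - (1 << 32) if a & 0x80000000 else a
        b = b - (1 << 32) if b & 0x80000000 else b
    # 0 - product, reduced to the 64-bit accumulator width.
    acc = (0 - a * b) & MASK64
    new_hi, new_lo = acc >> 32, acc & MASK32
    return new_hi, new_lo, (new_hi if move_hi else new_lo)
```

For instance, the unsigned product 0xFFFFFFFF * 1 negates to 0xFFFFFFFF00000001, whereas the same bit patterns treated as signed give -1 * 1, which negates to 1.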
MULT  Multiply

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 0000000000 (15..6) | MULT 011000 (5..0)

Format: MULT rs, rt    MIPS I

Purpose: Multiplies 32-bit signed integers.

Description: The contents of general-purpose registers rs and rt are multiplied, treating both operands as signed 32-bit integers. No integer overflow exception occurs under any circumstances. In 64-bit mode, the operands must be valid 32-bit, sign-extended values. When the operation completes, the lower word of the double result is loaded to special register LO, and the higher word of the double result is loaded to special register HI. In 64-bit mode, the results are sign-extended and stored.

Operation:
32  T - 2: LO ← undefined
           HI ← undefined
    T - 1: LO ← undefined
           HI ← undefined
    T:     t ← GPR[rs] * GPR[rt]
           LO ← t_31..0
           HI ← t_63..32
64  T - 2: LO ← undefined
           HI ← undefined
    T - 1: LO ← undefined
           HI ← undefined
    T:     t ← GPR[rs]_31..0 * GPR[rt]_31..0
           LO ← (t_31)^32 || t_31..0
           HI ← (t_63)^32 || t_63..32

Exceptions: None
MULTU  Multiply Unsigned

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | 0 0000000000 (15..6) | MULTU 011001 (5..0)

Format: MULTU rs, rt    MIPS I

Purpose: Multiplies 32-bit unsigned integers.

Description: The contents of general-purpose register rs and the contents of general-purpose register rt are multiplied, treating both operands as unsigned values. No overflow exception occurs under any circumstances. In 64-bit mode, the operands must be valid 32-bit, sign-extended values. When the operation completes, the lower word of the double result is loaded to special register LO, and the higher word of the double result is loaded to special register HI. In 64-bit mode, the results are sign-extended and stored.

Operation:
32  T - 2: LO ← undefined
           HI ← undefined
    T - 1: LO ← undefined
           HI ← undefined
    T:     t ← (0 || GPR[rs]) * (0 || GPR[rt])
           LO ← t_31..0
           HI ← t_63..32
64  T - 2: LO ← undefined
           HI ← undefined
    T - 1: LO ← undefined
           HI ← undefined
    T:     t ← (0 || GPR[rs]_31..0) * (0 || GPR[rt]_31..0)
           LO ← (t_31)^32 || t_31..0
           HI ← (t_63)^32 || t_63..32

Exceptions: None
MULU  Unsigned Multiply and Move LO

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | MULU 00001011001 (10..0)

Format: MULU rd, rs, rt    VR5500

Purpose: Combines multiplication and transfer of 32-bit unsigned integers for execution.

Description: This instruction multiplies the contents of general-purpose register rs by the contents of general-purpose register rt. It treats both operands as 32-bit unsigned integers. The lower 32 bits of special register HI and the lower 32 bits of special register LO are combined and used as an accumulator. The result of the multiplication is stored in the accumulator. The lower 32 bits of the result are also stored in general-purpose register rd. An integer overflow exception does not occur.

Operation:
32, 64  T: HI_31..0 || LO_31..0 ← GPR[rs] * GPR[rt]
           GPR[rd]_31..0 ← (GPR[rs] * GPR[rt])_31..0

Exceptions: None
NOR  NOR

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 00000 (10..6) | NOR 100111 (5..0)

Format: NOR rd, rs, rt    MIPS I

Purpose: Performs a bit-wise logical NOR operation.

Description: The contents of general-purpose register rs are combined with the contents of general-purpose register rt in a bit-wise logical NOR operation. The result is stored in general-purpose register rd.

Operation:
32, 64  T: GPR[rd] ← GPR[rs] nor GPR[rt]

Exceptions: None
OR  OR

Encoding: SPECIAL 000000 (bits 31..26) | rs (25..21) | rt (20..16) | rd (15..11) | 0 00000 (10..6) | OR 100101 (5..0)

Format: OR rd, rs, rt    MIPS I

Purpose: Performs a bit-wise logical OR operation.

Description: The contents of general-purpose register rs are combined with the contents of general-purpose register rt in a bit-wise logical OR operation. The result is stored in general-purpose register rd.

Operation:
32, 64  T: GPR[rd] ← GPR[rs] or GPR[rt]

Exceptions: None
ORI  OR Immediate

Encoding: ORI 001101 (bits 31..26) | rs (25..21) | rt (20..16) | immediate (15..0)

Format: ORI rt, rs, immediate    MIPS I

Purpose: Performs a bit-wise logical OR operation with a constant.

Description: The 16-bit immediate is zero-extended and combined with the contents of general-purpose register rs in a bit-wise logical OR operation. The result is stored in general-purpose register rt.

Operation:
32  T: GPR[rt] ← GPR[rs]_31..16 || (immediate or GPR[rs]_15..0)
64  T: GPR[rt] ← GPR[rs]_63..16 || (immediate or GPR[rs]_15..0)

Exceptions: None
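The Operation section writes ORI as a concatenation of the untouched upper bits of rs with the OR of the low halfword, which is the same as ORing rs with the zero-extended immediate. The short sketch below checks that equivalence. The function names ori and ori_concat and the xlen parameter (register width in bits) are illustrative assumptions.

```python
def ori(rs, immediate, xlen=64):
    """OR rs with the zero-extended 16-bit immediate."""
    mask = (1 << xlen) - 1
    return (rs & mask) | (immediate & 0xFFFF)


def ori_concat(rs, immediate, xlen=64):
    """Same result, written as GPR[rs]_(xlen-1)..16 || (immediate or GPR[rs]_15..0)."""
    upper = (rs & ((1 << xlen) - 1)) >> 16
    lower = (immediate & 0xFFFF) | (rs & 0xFFFF)
    return (upper << 16) | lower
```

Because the immediate is zero-extended rather than sign-extended, an immediate of 0xFFFF sets only the low 16 bits of the result.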
PREF  Prefetch (1/2)

Encoding: PREF 110011 (bits 31..26) | base (25..21) | hint (20..16) | offset (15..0)

Format: PREF hint, offset (base)    MIPS IV

Purpose: Prefetches data from memory.

Description: This instruction sign-extends a 16-bit offset and adds the result to the contents of general-purpose register base to generate a virtual address. It then loads the contents at the specified address to the data cache. Bits 20 to 16 (hint) of this instruction indicate how the loaded data is used. Note, however, that the contents of hint are used only for the processor to judge whether prefetching by this instruction is valid, and do not affect the actual operation. hint indicates the following operations.

hint     Operation  Description
0        Load       Predicts that data is loaded (without modification). Fetches data as if it were loaded.
1 to 31  -          Reserved

This is an auxiliary instruction that improves program performance. Neither the generated address nor the contents of hint change the status of the processor or system, or the meaning (purpose) of the program.

If this instruction causes a memory access, the access type to be used is determined by the generated address. In other words, the access type used to load or store at the generated address is also used for this instruction. However, an access to an uncached area does not occur.

If a translation entry for the specified memory position is not in the TLB, data cannot be prefetched from the mapped area. The absence of a translation entry in the TLB means that the memory position has not been accessed recently, so no benefit can be expected from prefetching data at that position.

Exceptions related to addressing do not occur as a result of executing this instruction. If an exception condition is detected, it is ignored, but the prefetch is not executed either. However, even if nothing is prefetched, processing that is not visible to the program, such as writing back a dirty cache line, may be performed.
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 462 pref prefetch (2/2) operation: 32 t: vaddr ((offset 15 ) 16 || offset 15..0 ) + gpr[base] (paddr, cca) addresstranslation (vaddr, data, load) prefetch (cca, paddr, vaddr, data, hint) 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, cca) addresstranslation (vaddr, data, load) prefetch (cca, paddr, vaddr, data, hint) exceptions: reserved instruction exception
ror: rotate right

Encoding: bits 31..26 = special (000000), bits 25..21 = 00001, bits 20..16 = rt, bits 15..11 = rd, bits 10..6 = sa, bits 5..0 = ror (000010)
Format: ror rd, rt, sa (VR5500)
Purpose: Rotates a word to the right by a fixed number of bits.
Description: This instruction rotates the contents of general-purpose register rt to the right by the number of bits specified by sa. The lower bits that are shifted out are inserted into the higher bits. The result is stored in general-purpose register rd.
Operation:
32, 64 T: gpr[rd] ← gpr[rt]_{sa-1..0} || gpr[rt]_{31..sa}
Exceptions: None
rorv: rotate right variable

Encoding: bits 31..26 = special (000000), bits 25..21 = rs, bits 20..16 = rt, bits 15..11 = rd, bits 10..6 = 00001, bits 5..0 = rorv (000110)
Format: rorv rd, rt, rs (VR5500)
Purpose: Rotates a word to the right by a specified number of bits.
Description: This instruction rotates the contents of general-purpose register rt to the right by the number of bits specified by the lower 5 bits of general-purpose register rs. The lower bits that are shifted out are inserted into the higher bits. The result is stored in general-purpose register rd.
Operation:
32, 64 T: s ← gpr[rs]_{4..0}
          gpr[rd] ← gpr[rt]_{s-1..0} || gpr[rt]_{31..s}
Exceptions: None
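The rotate operation of ror and rorv can be sketched in C as follows. The function names ror32 and rorv32 are hypothetical, and the model covers only the low 32 bits named in the operation; how the upper 32 bits of rd are filled in 64-bit mode is not stated on these pages and is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word right by s bits (0..31):
   bits s-1..0 of the input become the top s bits of the result. */
static inline uint32_t ror32(uint32_t word, unsigned s)
{
    assert(s < 32);
    if (s == 0)
        return word;                 /* avoid an undefined 32-bit shift in C */
    return (word >> s) | (word << (32 - s));
}

/* rorv takes the rotate amount from the lower 5 bits of rs. */
static inline uint32_t rorv32(uint32_t word, uint64_t rs)
{
    return ror32(word, (unsigned)(rs & 0x1F));
}
```

For instance, rotating 0x12345678 right by 8 moves the low byte 0x78 to the top, giving 0x78123456.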
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 465 sb store byte 26 31 sb 101000 0 25 base rt 21 20 16 15 offset format: sb rt, offset (base) mips i purpose: stores a byte in memory. description: the 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. the least-significant byte of register rt is stored at the effective address. operation: 32 t: vaddr ((offset 15 ) 16 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) paddr paddr psize ? 1..3 || (paddr 2..0 xor reverseendian 3 ) byte vaddr 2..0 xor bigendiancpu 3 data gpr[rt] 63 ? 8*byte..0 || 0 8*byte storememory (uncached, byte, data, paddr, vaddr, data) 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) paddr paddr psize ? 1..3 || (paddr 2..0 xor reverseendian 3 ) byte vaddr 2..0 xor bigendiancpu 3 data gpr[rt] 63 ? 8*byte..0 || 0 8*byte storememory (uncached, byte, data, paddr, vaddr, data) exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception
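The address generation and byte-lane selection in the sb operation can be sketched in C. This is a simplified model of the 64-bit pseudo-code only: it omits address translation and reverseendian, and the names sb_request, sb_model, and big_endian_cpu are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t vaddr;   /* sign_extend(offset) + gpr[base] */
    unsigned byte;    /* byte lane within the doubleword, 0..7 */
    uint64_t data;    /* gpr[rt] shifted into that lane */
} sb_request;

static inline sb_request sb_model(uint64_t base, int16_t offset,
                                  uint64_t rt, int big_endian_cpu)
{
    sb_request r;
    assert(big_endian_cpu == 0 || big_endian_cpu == 1);
    /* vaddr <- ((offset_15)^48 || offset_15..0) + gpr[base] */
    r.vaddr = base + (uint64_t)(int64_t)offset;
    /* byte <- vaddr_2..0 xor bigendiancpu^3 */
    r.byte = (unsigned)(r.vaddr & 7) ^ (big_endian_cpu ? 7u : 0u);
    /* data <- gpr[rt]_{63-8*byte..0} || 0^{8*byte} */
    r.data = rt << (8 * r.byte);
    return r;
}
```

The shift places the least significant byte of rt on the byte lane that corresponds to the target address, which is why the same register value lands on a different lane for a little-endian and a big-endian CPU.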
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 466 sc store conditional (1/2) 26 31 sc 111000 0 25 base rt 21 20 16 15 offset format: sc rt, offset (base) mips ii purpose: stores a word in memory and completes atomic read-modify-write. description: this instruction sign-extends a 16-bit offset , adds it to the contents of general-purpose register base , and generates a virtual address. the contents of general-purpose register rt are stored in the memory position of the specified address only when the ll bit is set. if another processor or device has changed the target address after the previous ll instruction, or if the eret instruction is executed between the ll and sc instructions, the contents of register rt are not stored in memory, and the sc instruction fails. whether the sc instruction has been successful or not is indicated by the contents of general-purpose register rt after this instruction has been executed. if the sc instruction is successful, the contents of general-purpose register rt are set to 1; they are cleared to 0 if the sc instruction has failed. the operation of the sc instruction is undefined if the address is different from the address used for the last ll instruction. this instruction can be used in the user mode. it is not necessary that cp0 be enabled. an address error exception occurs if the lower 2 bits of the address are not 0. if this instruction has failed and an exception occurs, the exception takes precedence. this instruction is defined to maintain software compatibility with the other v r series processors.
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 467 sc store conditional (2/2) operation: 32 t: vaddr ((offset 15 ) 16 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] 31..0 if llbit then storememory (uncached, word, data, paddr, vaddr, data) endif gpr[rt] 0 31 || llbit 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] 31..0 if llbit then storememory (uncached, word, data, paddr, vaddr, data) endif gpr[rt] 0 63 || llbit exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception
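The retry pattern that ll and sc are designed for can be illustrated with a single-threaded C model of the ll bit. This is a sketch, not real atomic code: cpu_model, ll_model, sc_model, and atomic_increment are hypothetical names, and failures_to_inject is an assumed stand-in for an eret or a competing write that clears the ll bit between ll and sc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     llbit;    /* set by ll; cleared by eret or a competing write */
    uint32_t memory;   /* the single word being updated */
} cpu_model;

static inline uint32_t ll_model(cpu_model *cpu)
{
    cpu->llbit = true;
    return cpu->memory;
}

/* Returns the value left in rt: 1 if the store was performed, 0 if not. */
static inline uint32_t sc_model(cpu_model *cpu, uint32_t rt)
{
    if (cpu->llbit)
        cpu->memory = rt;
    return cpu->llbit ? 1u : 0u;
}

/* Atomic increment: repeat ll / modify / sc until sc reports success.
   Returns the number of attempts that were needed. */
static inline uint32_t atomic_increment(cpu_model *cpu, int failures_to_inject)
{
    uint32_t attempts = 0;
    assert(cpu != 0);
    for (;;) {
        uint32_t value = ll_model(cpu);
        if (failures_to_inject > 0) {      /* simulated interference */
            cpu->llbit = false;
            failures_to_inject--;
        }
        attempts++;
        if (sc_model(cpu, value + 1))
            return attempts;
    }
}
```

The key point is that software must test the value returned in rt after every sc and branch back to the ll when it is 0.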
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 468 scd store conditional doubleword (1/2) 26 31 scd 111100 0 25 base rt 21 20 16 15 offset format: scd rt, offset (base) mips iii purpose: stores a doubleword in memory and completes atomic read-modify-write. description: this instruction sign-extends a 16-bit offset , adds it to the contents of general-purpose register base , and generates a virtual address. the contents of general-purpose register rt are stored in the memory position of the specified address only when the ll bit is set. if another processor or device has changed the target address after the previous lld instruction, or if the eret instruction is executed between the lld and scd instructions, the contents of register rt are not stored in memory, and the scd instruction fails. whether the scd instruction has been successful or not is indicated by the contents of general-purpose register rt after this instruction has been executed. if the scd instruction is successful, the contents of general-purpose register rt are set to 1; they are cleared to 0 if the scd instruction has failed. the operation of the scd instruction is undefined if the address is different from the address used for the last lld instruction. this instruction can be used in the user mode. it is not necessary that cp0 be enabled. an address error exception occurs if the lower 3 bits of the address are not 0. if this instruction has failed and an exception occurs, the exception takes precedence. this operation is defined in the 64-bit mode and 32-bit kernel mode. a reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode. this instruction is defined to maintain software compatibility with the other v r series processors. 
operation: 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] if llbit then storememory (uncached, doubleword, data, paddr, vaddr, data) endif gpr[rt] 0 63 || llbit remark the higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 469 scd store conditional doubleword (2/2) exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception reserved instruction exception (32-bit user/supervisor mode)
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 470 sd store doubleword 26 31 sd 111111 0 25 base rt 21 20 16 15 offset format: sd rt, offset (base) mips iii purpose: stores a doubleword in memory. description: the 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. the contents of general-purpose register rt are stored at the memory location specified by the effective address. an address error exception occurs if the lower 3 bits of the address are not 0. this operation is defined in the 64-bit mode and 32-bit kernel mode. a reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode. operation: 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] storememory (uncached, doubleword, data, paddr, vaddr, data) remark the higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode. exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception reserved instruction exception (32-bit user/supervisor mode)
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 471 sdcz store doubleword from coprocessor z (1/2) 26 31 sdcz 1111xx n ote 0 25 base rt 21 20 16 15 offset format: sdcz rt, offset (base) mips ii purpose: stores a doubleword in memory from the coprocessor general-purpose register. description: the 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. the contents of the doubleword at cpz register rt are stored in the memory location specified by the effective address. data to be stored is defined for each processor. an address error exception occurs if the lower 3 bits of the address are not 0. this instruction set to cp0 is invalid. if cp1 is specified and if the fr bit of the status register is 0 and the least significant bit of the rt field is not 0, the operation of this instruction is undefined. if the fr bit is 1, an odd or even register is specified by rt . operation: 32 t: vaddr ((offset 15 ) 16 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] storememory (uncached, doubleword, data, paddr, vaddr, data) 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] storememory (uncached, doubleword, data, paddr, vaddr, data) exceptions: tlb refill exception tlb invalid exception bus error exception address error exception coprocessor unusable exception note see the opcode table below, or 17.4 cpu instruction opcode bit encoding .
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 472 sdcz store doubleword from coprocessor z (2/2) opcode table: 31 1 30 1 29 1 28 1 27 1 26 0 0 sdc2 opcode coprocessor no. 31 1 30 1 29 1 28 1 27 0 26 1 0 sdc1 remark coprocessor 2 is reserved in the v r 5500.
sdl: store doubleword left

Encoding: bits 31..26 = sdl (101100), bits 25..21 = base, bits 20..16 = rt, bits 15..0 = offset
Format: sdl rt, offset (base) (MIPS III)
Purpose: Stores the most significant part of a doubleword in unaligned memory.
Description: This instruction is used in combination with the sdr instruction to store doubleword data from a register into a doubleword in memory that is not aligned on a doubleword boundary. The sdl instruction stores the higher-order part of the data, and the sdr instruction stores the lower-order part.
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. The virtual address specifies the most significant byte of the doubleword in memory. Of that doubleword, only the part lying within the same aligned doubleword as the target address is written, and it receives the higher-order portion of general-purpose register rt. The number of bytes stored varies from one to eight depending on the byte specified.
In other words, the most significant byte of general-purpose register rt is stored in the memory byte specified by the virtual address. As long as lower-order bytes remain within the same aligned doubleword, the following register bytes are stored in the successive lower-order memory bytes.

Memory (little endian) before storing:
  address 8:  15 14 13 12 11 10  9  8
  address 0:   7  6  5  4  3  2  1  0
Register $24:  a  b  c  d  e  f  g  h
Memory after storing with sdl $24,8 ($0):
  address 8:  15 14 13 12 11 10  9  a
  address 0:   7  6  5  4  3  2  1  0
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 474 sdl store doubleword left (2/3) an address error exception caused by the specified address not being aligned at a doubleword boundary does not occur. this operation is defined in the 64-bit mode and 32-bit kernel mode. a reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode. operation: 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) paddr paddr psize ? 1..3 || (paddr 2..0 xor reverseendian 3 ) if bigendianmem = 0 then paddr paddr 31..3 || 0 3 endif byte vaddr 2..0 xor bigendiancpu 3 data 0 56 ? 8*byte || gpr[rt] 63..56 ? 8*byte storememory (uncached, byte, data, paddr, vaddr, data) remark the higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 475 sdl store doubleword left (3/3) the relationship between the address assigned to the sdl instruction and its result (each byte of the register) is shown below. abcdefgh register ijklmnop memory bigendiancpu = 0 bigendiancpu = 1 vaddr 2..0 offset offset destination type lem bem destination type lem bem 0 1 2 3 4 5 6 7 ijklmnoa ijklmnab ijklmabc ijklabcd ijkabcde ijabcdef iabcdefg abcdefgh 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 7 6 5 4 3 2 1 0 abcdefgh iabcdefg ijabcdef ijkabcde ijklabcd ijklmabc ijklmnab ijklmnoa 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 7 remark type accesstype (see figure 3-3 byte specification related to load and store instruction ) output to memory offset paddr 2..0 output to memory lem little-endian memory (bigendianmem = 0) bem big-endian memory (bigendianmem = 1) exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception reserved instruction exception (32-bit user/supervisor mode)
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 476 sdr store doubleword right (1/3) 31 25 26 20 21 15 16 0 sdr 101101 base rt offset format: sdr rt, offset (base) mips iii purpose: stores the least significant part of a doubleword in unaligned memory. description: this instruction can be used in combination with the sdl instruction when storing a doubleword data in the register in a doubleword that does not exist at a doubleword boundary in the memory. the sdl instruction stores the higher word of the data, and the sdr instruction stores the lower word of the data in the memory. the 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. among the doubleword data in the memory whose least significant byte is the byte specified by the virtual address, the lower portion of general-purpose register rt is stored in the memory at the same doubleword boundary as the target address. the number of bytes to be stored varies from one to eight depending on the byte specified. in other words, the least significant byte of general-purpose register rt is stored in the memory specified by the virtual address. as long as there are higher bytes among the bytes at the same doubleword boundary, the operation to store the byte in the next byte of the memory will be continued. 15 7 14 6 13 5 12 4 11 3 10 2 9 1 8 0 address 8 address 0 memory (little endian) before storing a b c d e f g h $24 register sdr $24,1 ($0) 15 b 14 c 13 d 12 e 11 f 10 g 9 h 8 0 address 8 address 0 after storing
An address error exception caused by the specified address not being aligned at a doubleword boundary does not occur.
This operation is defined in the 64-bit mode and 32-bit kernel mode. A reserved instruction exception occurs if this instruction is executed in the 32-bit user mode or supervisor mode.
Operation:
64 T: vaddr ← ((offset_{15})^{48} || offset_{15..0}) + gpr[base]
      (paddr, uncached) ← addresstranslation (vaddr, data)
      paddr ← paddr_{psize-1..3} || (paddr_{2..0} xor reverseendian^{3})
      if bigendianmem = 0 then
          paddr ← paddr_{psize-1..3} || 0^{3}
      endif
      byte ← vaddr_{2..0} xor bigendiancpu^{3}
      data ← gpr[rt]_{63-8*byte..0} || 0^{8*byte}
      storememory (uncached, doubleword-byte, data, paddr, vaddr, data)
Remark: The higher 32 bits are ignored when a virtual address is generated in the 32-bit kernel mode.
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 478 sdr store doubleword right (3/3) the relationship between the address assigned to the sdr instruction and its result (each byte of the register) is shown below. abcdefgh register ijklmnop memory bigendiancpu = 0 bigendiancpu = 1 vaddr 2..0 offset offset destination type lem bem destination type lem bem 0 1 2 3 4 5 6 7 abcdefgh bcdefghp cdefghop defghnop efghmnop fghlmnop ghklmnop hjklmnop 7 6 5 4 3 2 1 0 0 1 2 3 4 5 6 7 0 0 0 0 0 0 0 0 hjklmnop ghklmnop fghlmnop efghmnop defghnop cdefghop bcdefghp abcdefgh 0 1 2 3 4 5 6 7 7 6 5 4 3 2 1 0 0 0 0 0 0 0 0 0 remark type accesstype (see figure 3-3 byte specification related load and store instruction ) output to memory offset paddr 2..0 output to memory lem little-endian memory (bigendianmem = 0) bem big-endian memory (bigendianmem = 1) exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception reserved instruction exception (32-bit user/supervisor mode)
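The way sdl and sdr combine into one unaligned doubleword store can be sketched in C for the little-endian case (bigendiancpu = 0, little-endian memory), following the destination columns of the two tables. The function names and the flat byte array standing in for memory are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sdl, little endian: the most significant register byte goes to addr;
   lower-order register bytes fill downward to the doubleword boundary. */
static inline void sdl_le(uint8_t *mem, size_t addr, uint64_t rt)
{
    size_t n = addr & 7;                       /* vaddr_2..0 */
    assert(mem != NULL);
    for (size_t k = 0; k <= n; k++)
        mem[addr - k] = (uint8_t)(rt >> (56 - 8 * k));
}

/* sdr, little endian: the least significant register byte goes to addr;
   higher-order register bytes fill upward to the end of the doubleword. */
static inline void sdr_le(uint8_t *mem, size_t addr, uint64_t rt)
{
    size_t n = addr & 7;
    assert(mem != NULL);
    for (size_t k = 0; k + n <= 7; k++)
        mem[addr + k] = (uint8_t)(rt >> (8 * k));
}

/* Unaligned store of rt at addr: sdr rt, 0(addr) then sdl rt, 7(addr). */
static inline void store_unaligned_le(uint8_t *mem, size_t addr, uint64_t rt)
{
    sdr_le(mem, addr, rt);
    sdl_le(mem, addr + 7, rt);
}
```

With this pairing, sdr writes the bytes that fall in the lower aligned doubleword and sdl writes the bytes that fall in the next one, so the eight bytes starting at addr end up holding rt in little-endian order.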
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 479 sh store halfword 26 31 sh 101001 0 25 base rt 21 20 16 15 offset format: sh rt, offset (base) mips i purpose: stores a halfword in memory. description: the 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate an unsigned effective address. the least-significant halfword of register rt is stored at the effective address. an address error exception occurs if the least-significant bit of the address is not 0. operation: 32 t: vaddr ((offset 15 ) 16 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) paddr paddr psize ? 1..3 || (paddr 2..0 xor(reverseendian 2 || 0)) byte vaddr 2..0 xor(bigendiancpu 2 || 0) data gpr[rt] 63 ? 8*byte..0 || 0 8*byte storememory (uncached, halfword, data, paddr, vaddr, data) 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) paddr paddr psize ? 1..3 || (paddr 2..0 xor(reverseendian 2 || 0)) byte vaddr 2..0 xor(bigendiancpu 2 || 0) data gpr[rt] 63 ? 8*byte..0 || 0 8*byte storememory (uncached, halfword, data, paddr, vaddr, data) exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 480 sll shift left logical 26 31 special 000000 0 rd 25 rt 21 20 16 15 11 10 0 00000 65 sll 000000 sa format: sll rd, rt, sa mips i purpose: logically shifts a word to the left by the fixed number of bits. description: the contents of general-purpose register rt are shifted left by sa bits, inserting zeros into the lower bits. the result is stored in general-purpose register rd. in 64-bit mode, the shifted 32-bit value is sign-extended and stored. when the shift amount is set to zero, sll sign-extends lower 32 bits of a 64-bit value. using this instruction, the 64-bit value can be generated from a 32-bit value. operation: 32 t: gpr[rd] gpr[rt] 31 ? sa..0 || 0 sa 64 t: s 0 || sa temp gpr[rt] 31 ? s..0 || 0 s gpr[rd] (temp 31 ) 32 || temp exceptions: none caution sll with a shift amount of zero may be treated as a nop by some assemblers, at some optimization levels. if using sll with a purpose of sign-extension, check the assembler specification.
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 481 sllv shift left logical variable 26 31 special 000000 0 rd 25 rs rt 21 20 16 15 11 10 0 00000 65 sllv 000100 format: sllv rd, rt, rs mips i purpose: logically shifts a word to the left by the specified number of bits. description: the contents of general-purpose register rt are shifted left the number of bits specified by the lower 5 bits contained in general-purpose register rs , inserting zeros into the lower bits. the result is stored in general- purpose register rd . in 64-bit mode, the shifted 32-bit value is sign-extended and stored. when the shift amount is set to zero, sllv sign-extends lower 32 bits of a 64-bit value. using this instruction, the 64-bit value can be generated from a 32-bit value. operation: 32 t: s gpr[rs] 4..0 gpr[rd] gpr[rt] (31 ? s)..0 || 0 s 64 t: s 0 || gpr[rs] 4..0 temp gpr[rt] (31 ? s)..0 || 0 s gpr[rd] (temp 31 ) 32 || temp exceptions: none caution sllv with a shift amount of zero may be treated as a nop by some assemblers, at some optimization levels. if using sllv with a purpose of sign-extension, check the assembler specification.
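The 64-bit behavior of sll and sllv, including the sign-extension that occurs with a shift amount of zero, can be sketched in C. The function names are hypothetical; the model follows the 64 T pseudo-code: shift the low 32 bits, then sign-extend bit 31 of the result.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate bit 31 of a 32-bit word into bits 63..32. */
static inline uint64_t sign_extend32(uint32_t w)
{
    return (w & 0x80000000u) ? (0xFFFFFFFF00000000ULL | w) : (uint64_t)w;
}

/* sll in 64-bit mode: temp <- gpr[rt]_{31-s..0} || 0^s, then sign-extend. */
static inline uint64_t sll_64(uint64_t rt, unsigned sa)
{
    assert(sa < 32);
    uint32_t temp = (uint32_t)((uint32_t)rt << sa);
    return sign_extend32(temp);
}

/* sllv takes the shift amount from the lower 5 bits of rs. */
static inline uint64_t sllv_64(uint64_t rt, uint64_t rs)
{
    return sll_64(rt, (unsigned)(rs & 0x1F));
}
```

With a shift amount of zero, the model discards bits 63..32 of rt and replaces them with copies of bit 31, which is the sign-extension use mentioned in the description.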
slt: set on less than

Encoding: bits 31..26 = special (000000), bits 25..21 = rs, bits 20..16 = rt, bits 15..11 = rd, bits 10..6 = 00000, bits 5..0 = slt (101010)
Format: slt rd, rs, rt (MIPS I)
Purpose: Stores the result of a signed less-than comparison.
Description: The contents of general-purpose register rt are subtracted from the contents of general-purpose register rs. Considering both quantities as signed integers, if the contents of general-purpose register rs are less than the contents of general-purpose register rt, the result is set to 1; otherwise the result is set to 0. The result is stored in general-purpose register rd.
No integer overflow exception occurs under any circumstances. The comparison is valid even if the subtraction used during the comparison overflows.
Operation:
32 T: if gpr[rs] < gpr[rt] then
          gpr[rd] ← 0^{31} || 1
      else
          gpr[rd] ← 0^{32}
      endif
64 T: if gpr[rs] < gpr[rt] then
          gpr[rd] ← 0^{63} || 1
      else
          gpr[rd] ← 0^{64}
      endif
Exceptions: None
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 483 slti set on less than immediate 26 31 slti 001010 0 25 rs rt 21 20 16 15 immediate format: slti rt, rs, immediate mips i purpose: stores the result of unequal comparison with a constant. description: the 16-bit immediate is sign-extended and subtracted from the contents of general-purpose register rs. considering both quantities as signed integers, if rs is less than the sign-extended immediate , the result is set to 1; otherwise the result is set to 0. no integer overflow exception occurs under any circumstances. the comparison is valid even if the subtraction used during the comparison overflows. operation: 32 t: if gpr[rs] < (immediate 15 ) 16 || immediate 15..0 then gpr[rt] 0 31 || 1 else gpr[rt] 0 32 endif 64 t: if gpr[rs] < (immediate 15 ) 48 || immediate 15..0 then gpr[rt] 0 63 || 1 else gpr[rt] 0 64 endif exceptions: none
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 484 sltiu set on less than immediate unsigned 26 31 sltiu 001011 0 25 rs rt 21 20 16 15 immediate format: sltiu rt, rs, immediate mips i purpose: stores the result of unsigned unequal comparison with a constant. description: the 16-bit immediate is sign-extended and subtracted from the contents of general-purpose register rs. considering both quantities as unsigned integers, if rs is less than the sign-extended immediate , the result is set to 1; otherwise the result is set to 0. no integer overflow exception occurs under any circumstances. the comparison is valid even if the subtraction used during the comparison overflows. operation: 32 t: if (0 || gpr[rs] ) < (immediate 15 ) 16 || immediate 15..0 then gpr[rt] 0 31 || 1 else gpr[rt] 0 32 endif 64 t: if (0 || gpr[rs] ) < (immediate 15 ) 48 || immediate 15..0 then gpr[rt] 0 63 || 1 else gpr[rt] 0 64 endif exceptions: none
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 485 sltu set on less than unsigned 26 31 special 000000 0 rd 25 rs rt 21 20 16 15 11 10 0 00000 65 sltu 101011 format: sltu rd, rs, rt mips i purpose: stores the result of unsigned unequal comparison. description: the contents of general-purpose register rt are subtracted from the contents of general-purpose register rs. considering both quantities as unsigned integers, if the contents of general-purpose register rs are less than the contents of general-purpose register rt , the result is set to 1; otherwise the result is set to 0. no integer overflow exception occurs under any circumstances. the comparison is valid even if the subtraction used during the comparison overflows. operation: 32 t: if (0 || gpr[rs] ) < 0 || gpr[rt] then gpr[rd] 0 31 || 1 else gpr[rd] 0 32 endif 64 t: if (0 || gpr[rs] ) < 0 || gpr[rt] then gpr[rd] 0 63 || 1 else gpr[rd] 0 64 endif exceptions: none
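The difference between the signed and unsigned set-on-less-than instructions, and in particular the fact that sltiu sign-extends its immediate before comparing it as an unsigned value, can be sketched in C for 64-bit mode. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SIGN64 0x8000000000000000ULL

/* slt: signed comparison. Flipping the sign bit maps signed order onto
   unsigned order, which avoids implementation-defined conversions. */
static inline uint64_t slt_model(uint64_t rs, uint64_t rt)
{
    return (rs ^ SIGN64) < (rt ^ SIGN64);
}

/* sltu: unsigned comparison of the full register contents. */
static inline uint64_t sltu_model(uint64_t rs, uint64_t rt)
{
    return rs < rt;
}

/* slti: sign-extend the 16-bit immediate, then compare as signed. */
static inline uint64_t slti_model(uint64_t rs, int16_t immediate)
{
    return slt_model(rs, (uint64_t)(int64_t)immediate);
}

/* sltiu: sign-extend the 16-bit immediate, then compare as unsigned. */
static inline uint64_t sltiu_model(uint64_t rs, int16_t immediate)
{
    return sltu_model(rs, (uint64_t)(int64_t)immediate);
}
```

One consequence is that an immediate of -1 in sltiu is compared as the largest unsigned value, so the result is 1 for every rs except all ones.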
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 486 sra shift right arithmetic 26 31 special 000000 0 rd 25 rt 21 20 16 15 11 10 0 00000 65 sra 000011 sa format: sra rd, rt, sa mips i purpose: arithmetically shifts a word to the right by the fixed number of bits. description: the contents of general-purpose register rt are shifted right by the number of bits specified by sa , sign-extending the higher bits. the result is stored in general-purpose register rd . in 64-bit mode, the operand must be a valid sign-extended, 32-bit value. operation: 32 t: gpr[rd] (gpr[rt] 31 ) sa || gpr[rt] 31..sa 64 t: s 0 || sa temp (gpr[rt] 31 ) s || gpr[rt] 31..s gpr[rd] (temp 31 ) 32 || temp exceptions: none
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 487 srav shift right arithmetic variable 26 31 special 000000 0 rd 25 rs rt 21 20 16 15 11 10 0 00000 65 srav 000111 format: srav rd, rt, rs mips i purpose: arithmetically shifts a word to the right by the specified number of bits. description: the contents of general-purpose register rt are shifted right by the number of bits specified by the lower 5 bits of general-purpose register rs , sign-extending the higher bits. the result is stored in general-purpose register rd . in 64-bit mode, the operand must be a valid sign-extended, 32-bit value. operation: 32 t: s gpr[rs] 4..0 gpr[rd] (gpr[rt] 31 ) s || gpr[rt] 31..s 64 t: s gpr[rs] 4..0 temp (gpr[rt] 31 ) s || gpr[rt] 31..s gpr[rd] (temp 31 ) 32 || temp exceptions: none
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 488 srl shift right logical 26 31 special 000000 0 rd 25 rt 21 20 16 15 11 10 0 00000 65 srl 000010 sa format: srl rd, rt, sa mips i purpose: logically shifts a word to the right by the fixed number of bits. description: the contents of general-purpose register rt are shifted right by the number of bits specified by sa , inserting zeros into the higher bits. the result is stored in general-purpose register rd . in 64-bit mode, the operand must be a valid sign-extended, 32-bit value. operation: 32 t: gpr[rd] 0 sa || gpr[rt] 31..sa 64 t: s 0 || sa temp 0 s || gpr[rt] 31..s gpr[rd] (temp 31 ) 32 || temp exceptions: none
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 489 srlv shift right logical variable 26 31 special 000000 0 rd 25 rs rt 21 20 16 15 11 10 0 00000 65 srlv 000110 format: srlv rd, rt, rs mips i purpose: logically shifts a word to the right by the specified number of bits. description: the contents of general-purpose register rt are shifted right by the number of bits specified by the lower 5 bits of general-purpose register rs, inserting zeros into the higher bits. the result is stored in general-purpose register rd . in 64-bit mode, the operand must be a valid sign-extended, 32-bit value. operation: 32 t: s gpr[rs] 4..0 gpr[rd] 0 s || gpr[rt] 31..s 64 t: s gpr[rs] 4..0 temp 0 s || gpr[rt] 31..s gpr[rd] (temp 31 ) 32 || temp exceptions: none
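The 64-bit pseudo-code of the right shifts can be sketched in C. In both srl and sra the shift operates on the low 32 bits, and the 32-bit result is then sign-extended into rd; the two differ only in what is inserted into the vacated higher bits. The function names are hypothetical, and the operand is assumed to be a valid sign-extended 32-bit value as the descriptions require.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate bit 31 of a 32-bit word into bits 63..32. */
static inline uint64_t sign_extend32(uint32_t w)
{
    return (w & 0x80000000u) ? (0xFFFFFFFF00000000ULL | w) : (uint64_t)w;
}

/* srl: temp <- 0^s || gpr[rt]_{31..s}, then sign-extend temp. */
static inline uint64_t srl_64(uint64_t rt, unsigned sa)
{
    assert(sa < 32);
    uint32_t temp = (uint32_t)rt >> sa;
    return sign_extend32(temp);
}

/* sra: temp <- (gpr[rt]_31)^s || gpr[rt]_{31..s}, then sign-extend temp. */
static inline uint64_t sra_64(uint64_t rt, unsigned sa)
{
    assert(sa < 32);
    uint32_t word = (uint32_t)rt;
    uint32_t temp = word >> sa;
    if ((word & 0x80000000u) && sa > 0)
        temp |= ~(0xFFFFFFFFu >> sa);   /* fill the top sa bits with ones */
    return sign_extend32(temp);
}

/* The variable forms take the shift amount from the lower 5 bits of rs. */
static inline uint64_t srlv_64(uint64_t rt, uint64_t rs)
{
    return srl_64(rt, (unsigned)(rs & 0x1F));
}

static inline uint64_t srav_64(uint64_t rt, uint64_t rs)
{
    return sra_64(rt, (unsigned)(rs & 0x1F));
}
```

Note that srl with a nonzero shift amount always yields a result whose bit 31 is 0, so the sign-extension only has a visible effect when the shift amount is zero.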
ssnop: superscalar NOP

Encoding: bits 31..26 = special (000000), bits 25..21 = 00000, bits 20..16 = 00000, bits 15..11 = 00000, bits 10..6 = 00001, bits 5..0 = sll (000000)
Format: ssnop (VR5500)
Description: This instruction consumes the execution time of one instruction without affecting the status of the processor or data.
In practice, execution of the next instruction is postponed until all the instructions executed before this instruction have passed through the commit stage. If this instruction is in a branch delay slot, the CPU waits until all the instructions executed before the immediately preceding branch instruction have passed through the commit stage.
Execution of the next instruction is also postponed until all writeback for load instructions executed to a non-blocking area before this instruction is completed.
Operation:
32, 64 T: gpr[0] ← gpr[0]_{30..0} || 0
Exceptions: None
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 491 sub subtract 26 31 special 000000 0 rd 25 rs rt 21 20 16 15 11 10 0 00000 65 sub 100010 format: sub rd, rs, rt mips i purpose: subtracts a 32-bit integer. a trap is performed if an overflow occurs. description: the contents of general-purpose register rt are subtracted from the contents of general-purpose register rs, and the result is stored in general-purpose register rd. in 64-bit mode, the operands must be valid sign-extended, 32- bit values. an integer overflow exception occurs if the carries out of bits 30 and 31 differ (2's complement overflow). the destination register rd is not modified when an integer overflow exception occurs. operation: 32 t: gpr[rd] gpr[rs] ? gpr[rt] 64 t: temp gpr[rs] ? gpr[rt] gpr[rd] (temp 31 ) 32 || temp 31..0 exceptions: integer overflow exception
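The overflow rule for sub can be sketched in C. The condition "the carries out of bits 30 and 31 differ" is equivalent to the familiar sign test used below: the operands have different signs and the sign of the difference differs from the sign of rs. The names sub_result and sub_model are hypothetical, and the exception itself is modeled only as a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     overflow;   /* true if an integer overflow exception would occur */
    uint64_t rd;         /* destination value; unchanged on overflow */
} sub_result;

/* Replicate bit 31 of a 32-bit word into bits 63..32. */
static inline uint64_t sign_extend32(uint32_t w)
{
    return (w & 0x80000000u) ? (0xFFFFFFFF00000000ULL | w) : (uint64_t)w;
}

static inline sub_result sub_model(uint32_t rs, uint32_t rt, uint64_t old_rd)
{
    sub_result r;
    uint32_t diff = rs - rt;              /* wraps modulo 2^32 */
    /* 2's complement overflow: operand signs differ and the result sign
       differs from the sign of rs. */
    r.overflow = (((rs ^ rt) & (rs ^ diff)) >> 31) != 0;
    r.rd = r.overflow ? old_rd : sign_extend32(diff);
    assert(!r.overflow || r.rd == old_rd);  /* rd is not modified on overflow */
    return r;
}
```

subu computes the same wrapped difference and sign-extends it, but never raises the exception.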
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 492 subu subtract unsigned 26 31 special 000000 0 rd 25 rs rt 21 20 16 15 11 10 0 00000 65 subu 100011 format: subu rd, rs, rt mips i purpose: subtracts a 32-bit integer. description: the contents of general-purpose register rt are subtracted from the contents of general-purpose register rs , and the result is stored in general-purpose register rd . in 64-bit mode, the operands must be valid sign-extended, 32- bit values. the only difference between this instruction and the sub instruction is that subu never causes an integer overflow exception. operation: 32 t: gpr[rd] gpr[rs] ? gpr[rt] 64 t: temp gpr[rs] ? gpr[rt] gpr[rd] (temp 31 ) 32 || temp 31..0 exceptions: none
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 493 sw store word 26 31 sw 101011 0 25 base rt 21 20 16 15 offset format: sw rt, offset (base) mips i purpose: stores a word in memory. description: the 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. the contents of general-purpose register rt are stored at the memory location specified by the effective address. an address error exception occurs if the lower 2 bits of the address are not 0. operation: 32 t: vaddr ((offset 15 ) 16 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] 31..0 storememory (uncached, word, data, paddr, vaddr, data) 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) data gpr[rt] 31..0 storememory (uncached, word, data, paddr, vaddr, data) exceptions: tlb refill exception tlb invalid exception tlb modified exception bus error exception address error exception
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 494 swcz store word from coprocessor z (1/2) 26 31 swcz 1110xx n ote 0 25 base rt 21 20 16 15 offset format: swcz rt, offset (base) mips i purpose: stores a word in memory from the coprocessor general-purpose register. description: the 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. the contents of the cpz register rt are stored in the memory location specified by the effective address. data to be stored is defined for each processor. if the lower 2 bits of the address are not 0, an address error exception occurs. this instruction set to cp0 is invalid. operation: 32 t: vaddr ((offset 15 ) 16 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) paddr paddr psize ? 1..3 || (paddr 2..0 xor (reverseendian || 0 2 )) byte vaddr 2..0 xor (bigendiancpu || 0 2 ) data copzsw (byte, rt) storememory (uncached, word, data, paddr, vaddr, data) 64 t: vaddr ((offset 15 ) 48 || offset 15..0 ) + gpr[base] (paddr, uncached) addresstranslation (vaddr, data) paddr paddr psize ? 1..3 || (paddr 2..0 xor (reverseendian || 0 2 )) byte vaddr 2..0 xor (bigendiancpu || 0 2 ) data copzsw (byte, rt) storememory (uncached, word, data, paddr, vaddr, data) exceptions: tlb refill exception tlb invalid exception bus error exception address error exception coprocessor unusable exception note see the opcode table below, or 17.4 cpu instruction opcode bit encoding .
SWCz  Store Word from Coprocessor z (2/2)

Opcode table:

          Bit 31  30  29  28  27  26
  SWC1        1   1   1   0   0   1
  SWC2        1   1   1   0   1   0

  Bits 31..28: opcode
  Bits 27..26: coprocessor no.

Remark  Coprocessor 2 is reserved in the VR5500.
SWL  Store Word Left (1/3)

  Bits    31..26       25..21  20..16  15..0
  Field   SWL 101010   base    rt      offset

Format: SWL rt, offset(base)    MIPS I

Purpose: Stores the most significant part of a word in unaligned memory.

Description:
This instruction can be used in combination with the SWR instruction to store the word data in a register to a memory word that is not aligned at a word boundary. The SWL instruction stores the higher portion of the data, and the SWR instruction stores the lower portion of the data.
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. Taking the word in memory whose most significant byte is the byte specified by the virtual address, the higher portion of general-purpose register rt is stored in the part of that word that lies within the same aligned word as the target address. The number of bytes stored varies from one to four depending on the byte specified.
In other words, the most significant byte of general-purpose register rt is stored in the memory byte specified by the virtual address. As long as lower-order bytes remain within the same aligned word, the next byte of the register is stored in the next lower-order byte of memory.
No address error exception occurs as a result of the specified address not being aligned at a word boundary.

  Memory (little endian), before storing
    Address 4:  7 6 5 4
    Address 0:  3 2 1 0

  Register $24:  A B C D

  SWL $24,4($0)

  Memory, after storing
    Address 4:  7 6 5 A
    Address 0:  3 2 1 0
SWL  Store Word Left (2/3)

Operation:
  32  T: vAddr <- ((offset_15)^16 || offset_15..0) + GPR[base]
         (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
         pAddr <- pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         if BigEndianMem = 0 then
           pAddr <- pAddr_31..2 || 0^2
         endif
         byte <- vAddr_1..0 xor BigEndianCPU^2
         if (vAddr_2 xor BigEndianCPU) = 0 then
           data <- 0^32 || 0^(24-8*byte) || GPR[rt]_31..24-8*byte
         else
           data <- 0^(24-8*byte) || GPR[rt]_31..24-8*byte || 0^32
         endif
         StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
  64  T: vAddr <- ((offset_15)^48 || offset_15..0) + GPR[base]
         (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
         pAddr <- pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         if BigEndianMem = 0 then
           pAddr <- pAddr_31..2 || 0^2
         endif
         byte <- vAddr_1..0 xor BigEndianCPU^2
         if (vAddr_2 xor BigEndianCPU) = 0 then
           data <- 0^32 || 0^(24-8*byte) || GPR[rt]_31..24-8*byte
         else
           data <- 0^(24-8*byte) || GPR[rt]_31..24-8*byte || 0^32
         endif
         StoreMemory (uncached, byte, data, pAddr, vAddr, DATA)
SWL  Store Word Left (3/3)

The relationship between the address assigned to the SWL instruction and its result (each byte of the register) is shown below.

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                  BigEndianCPU = 1
  vAddr_2..0   Destination  Type  Offset         Destination  Type  Offset
                                  LEM  BEM                          LEM  BEM
      0        IJKLMNOE     0     0    7         EFGHMNOP     3     4    0
      1        IJKLMNEF     1     0    6         IEFGMNOP     2     4    1
      2        IJKLMEFG     2     0    5         IJEFMNOP     1     4    2
      3        IJKLEFGH     3     0    4         IJKEMNOP     0     4    3
      4        IJKEMNOP     0     4    3         IJKLEFGH     3     0    4
      5        IJEFMNOP     1     4    2         IJKLMEFG     2     0    5
      6        IEFGMNOP     2     4    1         IJKLMNEF     1     0    6
      7        EFGHMNOP     3     4    0         IJKLMNOE     0     0    7

Remark  Type:    AccessType (see Figure 3-3 Byte Specification Related Load and Store Instruction) output to memory
        Offset:  pAddr_2..0 output to memory
        LEM:     Little-endian memory (BigEndianMem = 0)
        BEM:     Big-endian memory (BigEndianMem = 1)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
SWR  Store Word Right (1/3)

  Bits    31..26       25..21  20..16  15..0
  Field   SWR 101110   base    rt      offset

Format: SWR rt, offset(base)    MIPS I

Purpose: Stores the least significant part of a word in unaligned memory.

Description:
This instruction can be used in combination with the SWL instruction to store the word data in a register to a memory word that is not aligned at a word boundary. The SWL instruction stores the higher portion of the data, and the SWR instruction stores the lower portion of the data.
The 16-bit offset is sign-extended and added to the contents of general-purpose register base to generate a virtual address. Taking the word in memory whose least significant byte is the byte specified by the virtual address, the lower portion of general-purpose register rt is stored in the part of that word that lies within the same aligned word as the target address. The number of bytes stored varies from one to four depending on the byte specified.
In other words, the least significant byte of general-purpose register rt is stored in the memory byte specified by the virtual address. As long as higher-order bytes remain within the same aligned word, the next byte of the register is stored in the next higher-order byte of memory.
No address error exception occurs as a result of the specified address not being aligned at a word boundary.

  Memory (little endian), before storing
    Address 4:  7 6 5 4
    Address 0:  3 2 1 0

  Register $24:  A B C D

  SWR $24,1($0)

  Memory, after storing
    Address 4:  7 6 5 4
    Address 0:  B C D 0
SWR  Store Word Right (2/3)

Operation:
  32  T: vAddr <- ((offset_15)^16 || offset_15..0) + GPR[base]
         (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
         pAddr <- pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         if BigEndianMem = 0 then
           pAddr <- pAddr_31..2 || 0^2
         endif
         byte <- vAddr_1..0 xor BigEndianCPU^2
         if (vAddr_2 xor BigEndianCPU) = 0 then
           data <- 0^32 || GPR[rt]_31-8*byte..0 || 0^(8*byte)
         else
           data <- GPR[rt]_31-8*byte..0 || 0^(8*byte) || 0^32
         endif
         StoreMemory (uncached, WORD - byte, data, pAddr, vAddr, DATA)
  64  T: vAddr <- ((offset_15)^48 || offset_15..0) + GPR[base]
         (pAddr, uncached) <- AddressTranslation (vAddr, DATA)
         pAddr <- pAddr_PSIZE-1..3 || (pAddr_2..0 xor ReverseEndian^3)
         if BigEndianMem = 0 then
           pAddr <- pAddr_31..2 || 0^2
         endif
         byte <- vAddr_1..0 xor BigEndianCPU^2
         if (vAddr_2 xor BigEndianCPU) = 0 then
           data <- 0^32 || GPR[rt]_31-8*byte..0 || 0^(8*byte)
         else
           data <- GPR[rt]_31-8*byte..0 || 0^(8*byte) || 0^32
         endif
         StoreMemory (uncached, WORD - byte, data, pAddr, vAddr, DATA)
SWR  Store Word Right (3/3)

The relationship between the address assigned to the SWR instruction and its result (each byte of the register) is shown below.

  Register:  A B C D E F G H
  Memory:    I J K L M N O P

               BigEndianCPU = 0                  BigEndianCPU = 1
  vAddr_2..0   Destination  Type  Offset         Destination  Type  Offset
                                  LEM  BEM                          LEM  BEM
      0        IJKLEFGH     3     0    4         HJKLMNOP     0     7    0
      1        IJKLFGHP     2     1    4         GHKLMNOP     1     6    0
      2        IJKLGHOP     1     2    4         FGHLMNOP     2     5    0
      3        IJKLHNOP     0     3    4         EFGHMNOP     3     4    0
      4        EFGHMNOP     3     4    0         IJKLHNOP     0     3    4
      5        FGHLMNOP     2     5    0         IJKLGHOP     1     2    4
      6        GHKLMNOP     1     6    0         IJKLFGHP     2     1    4
      7        HJKLMNOP     0     7    0         IJKLEFGH     3     0    4

Remark  Type:    AccessType (see Figure 3-3 Byte Specification Related Load and Store Instruction) output to memory
        Offset:  pAddr_2..0 output to memory
        LEM:     Little-endian memory (BigEndianMem = 0)
        BEM:     Big-endian memory (BigEndianMem = 1)

Exceptions:
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
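The byte movement in the SWL and SWR tables can be checked with a small model. The Python sketch below covers only the BigEndianCPU = 0 case and treats memory as a byte array indexed by address; the function names, the byte-array memory model, and the show helper (which prints a doubleword most significant byte first, as the tables do) are assumptions made for illustration, not part of the manual.

```python
def store_word_left(mem: bytearray, vaddr: int, reg: int) -> None:
    """SWL, BigEndianCPU = 0: register byte 31..24 goes to vaddr, the
    following register bytes go to lower addresses down to the word boundary."""
    word = reg & 0xFFFFFFFF
    for i in range((vaddr & 3) + 1):
        mem[vaddr - i] = (word >> (24 - 8 * i)) & 0xFF


def store_word_right(mem: bytearray, vaddr: int, reg: int) -> None:
    """SWR, BigEndianCPU = 0: register byte 7..0 goes to vaddr, the
    following register bytes go to higher addresses up to the word boundary."""
    word = reg & 0xFFFFFFFF
    for i in range(4 - (vaddr & 3)):
        mem[vaddr + i] = (word >> (8 * i)) & 0xFF


def show(mem: bytearray) -> str:
    """Render addresses 7..0 left to right, matching the table layout."""
    return bytes(reversed(mem[:8])).decode("ascii")


def fresh_memory() -> bytearray:
    """Doubleword I J K L M N O P, with P at address 0 and I at address 7."""
    return bytearray(b"PONMLKJI")


REGISTER = int.from_bytes(b"ABCDEFGH", "big")  # low word is E F G H

# Storing an unaligned word at address 1 takes one SWR and one SWL.
memory = fresh_memory()
store_word_right(memory, 1, REGISTER)      # bytes at addresses 1..3
store_word_left(memory, 1 + 3, REGISTER)   # byte at address 4
unaligned_result = show(memory)
```

In this model the pair SWR at address n and SWL at address n + 3 together write all four register bytes, which is the combined use the description refers to.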
SYNC  Synchronize

  Bits    31..26           25..11                10..6   5..0
  Field   SPECIAL 000000   0 (000000000000000)   stype   SYNC 001111

Format: SYNC    MIPS II

Purpose: Determines the order in which the common memory is referenced by load/store instructions in a multi-processor environment.

Description:
The SYNC instruction is executed as a NOP on the VR5500. This instruction is defined to maintain software compatibility with the other VR Series processors.
In practice, execution of the next instruction is postponed until all the instructions executed before this instruction pass through the commit stage. If this instruction is in a branch delay slot, the CPU waits until all the instructions executed before the immediately preceding branch instruction pass through the commit stage.
Execution of the next instruction is also postponed until all the system interface requests by load/store instructions executed before this instruction have been issued. In this way, external accesses and writebacks to memory are processed in the same sequence as the load/store instructions executed before and after the SYNC instruction. The CPU does not wait for the issuance of a system interface request by an instruction other than a load/store instruction, or for the issuance of an instruction fetch.
The processor treats the stype field as 0 regardless of the value of this field.

Operation:
  32, 64  T: SyncOperation ()

Exceptions:
  None
SYSCALL  System Call

  Bits    31..26           25..6   5..0
  Field   SPECIAL 000000   code    SYSCALL 001100

Format: SYSCALL    MIPS I

Purpose: Generates a system call exception.

Description:
A system call exception occurs, immediately and unconditionally transferring control to the exception handler.
The code field is available for use as software parameters, but it can be retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T: SystemCallException

Exceptions:
  System call exception
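Because the code field reaches the handler only through the instruction word itself, a handler has to load that word and extract bits 25..6. The following Python sketch shows the extraction on a 32-bit instruction word that has already been read from memory; the function name and the ValueError check are assumptions for illustration, and how the handler locates the word (for example, through the EPC register) is outside this sketch.

```python
SPECIAL_OPCODE = 0b000000
SYSCALL_FUNCTION = 0b001100


def syscall_code(instruction_word: int) -> int:
    """Return the 20-bit code field (bits 25..6) of a SYSCALL instruction word."""
    opcode = (instruction_word >> 26) & 0x3F
    function = instruction_word & 0x3F
    if opcode != SPECIAL_OPCODE or function != SYSCALL_FUNCTION:
        raise ValueError("not a SYSCALL instruction word")
    return (instruction_word >> 6) & 0xFFFFF
```

The same approach applies to the trap instructions that carry a code field, although their field occupies bits 15..6 rather than bits 25..6.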
TEQ  Trap If Equal

  Bits    31..26           25..21  20..16  15..6   5..0
  Field   SPECIAL 000000   rs      rt      code    TEQ 110100

Format: TEQ rs, rt    MIPS II

Purpose: Compares general-purpose registers and executes a conditional trap.

Description:
The contents of general-purpose register rt are compared to the contents of general-purpose register rs. If the contents of general-purpose register rs are equal to the contents of general-purpose register rt, a trap exception occurs.
The code field is available for use as software parameters, but it can be retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T: if GPR[rs] = GPR[rt] then
               TrapException
             endif

Exceptions:
  Trap exception
TEQI  Trap If Equal Immediate

  Bits    31..26          25..21  20..16       15..0
  Field   REGIMM 000001   rs      TEQI 01100   immediate

Format: TEQI rs, immediate    MIPS II

Purpose: Compares a general-purpose register and a constant and executes a conditional trap.

Description:
The 16-bit immediate is sign-extended and compared to the contents of general-purpose register rs. If the contents of general-purpose register rs are equal to the sign-extended immediate, a trap exception occurs.

Operation:
  32  T: if GPR[rs] = (immediate_15)^16 || immediate_15..0 then
           TrapException
         endif
  64  T: if GPR[rs] = (immediate_15)^48 || immediate_15..0 then
           TrapException
         endif

Exceptions:
  Trap exception
TGE  Trap If Greater Than Or Equal

  Bits    31..26           25..21  20..16  15..6   5..0
  Field   SPECIAL 000000   rs      rt      code    TGE 110000

Format: TGE rs, rt    MIPS II

Purpose: Compares general-purpose registers and executes a conditional trap.

Description:
The contents of general-purpose register rt are compared to the contents of general-purpose register rs. Considering both quantities as signed integers, if the contents of general-purpose register rs are greater than or equal to the contents of general-purpose register rt, a trap exception occurs.
The code field is available for use as software parameters, but it can be retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T: if GPR[rs] >= GPR[rt] then
               TrapException
             endif

Exceptions:
  Trap exception
TGEI  Trap If Greater Than Or Equal Immediate

  Bits    31..26          25..21  20..16       15..0
  Field   REGIMM 000001   rs      TGEI 01000   immediate

Format: TGEI rs, immediate    MIPS II

Purpose: Compares a general-purpose register and a constant and executes a conditional trap.

Description:
The 16-bit immediate is sign-extended and compared to the contents of general-purpose register rs. Considering both quantities as signed integers, if the contents of general-purpose register rs are greater than or equal to the sign-extended immediate, a trap exception occurs.

Operation:
  32  T: if GPR[rs] >= (immediate_15)^16 || immediate_15..0 then
           TrapException
         endif
  64  T: if GPR[rs] >= (immediate_15)^48 || immediate_15..0 then
           TrapException
         endif

Exceptions:
  Trap exception
TGEIU  Trap If Greater Than Or Equal Immediate Unsigned

  Bits    31..26          25..21  20..16        15..0
  Field   REGIMM 000001   rs      TGEIU 01001   immediate

Format: TGEIU rs, immediate    MIPS II

Purpose: Compares a general-purpose register and a constant and executes a conditional trap.

Description:
The 16-bit immediate is sign-extended and compared to the contents of general-purpose register rs. Considering both quantities as unsigned integers, if the contents of general-purpose register rs are greater than or equal to the sign-extended immediate, a trap exception occurs.

Operation:
  32  T: if (0 || GPR[rs]) >= (0 || (immediate_15)^16 || immediate_15..0) then
           TrapException
         endif
  64  T: if (0 || GPR[rs]) >= (0 || (immediate_15)^48 || immediate_15..0) then
           TrapException
         endif

Exceptions:
  Trap exception
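One point that is easy to miss in TGEIU is that the immediate is still sign-extended before the unsigned comparison, so an immediate of 0xFFFF becomes the largest unsigned value rather than 65535. The Python sketch below contrasts the TGEI and TGEIU conditions under that reading of the operation; the function names and the width parameter are assumptions made for this example.

```python
def sign_extend16(immediate: int, width: int) -> int:
    """Return the width-bit pattern obtained by sign-extending a 16-bit value."""
    immediate &= 0xFFFF
    if immediate & 0x8000:
        immediate |= ((1 << width) - 1) & ~0xFFFF
    return immediate


def to_signed(value: int, width: int) -> int:
    """Interpret a width-bit pattern as a two's-complement signed integer."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def tgei_traps(rs_value: int, immediate: int, width: int = 64) -> bool:
    """Signed comparison, as in TGEI."""
    return to_signed(rs_value, width) >= to_signed(sign_extend16(immediate, width), width)


def tgeiu_traps(rs_value: int, immediate: int, width: int = 64) -> bool:
    """Unsigned comparison against the sign-extended immediate, as in TGEIU."""
    return (rs_value & ((1 << width) - 1)) >= sign_extend16(immediate, width)
```

With rs holding 5 and an immediate field of 0xFFFF, the signed form traps (5 is not less than -1) while the unsigned form does not.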
TGEU  Trap If Greater Than Or Equal Unsigned

  Bits    31..26           25..21  20..16  15..6   5..0
  Field   SPECIAL 000000   rs      rt      code    TGEU 110001

Format: TGEU rs, rt    MIPS II

Purpose: Compares general-purpose registers and executes a conditional trap.

Description:
The contents of general-purpose register rt are compared to the contents of general-purpose register rs. Considering both quantities as unsigned integers, if the contents of general-purpose register rs are greater than or equal to the contents of general-purpose register rt, a trap exception occurs.
The code field is available for use as software parameters, but it can be retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T: if (0 || GPR[rs]) >= (0 || GPR[rt]) then
               TrapException
             endif

Exceptions:
  Trap exception
TLBP  Probe TLB For Matching Entry

  Bits    31..26        25     24..6                     5..0
  Field   COP0 010000   CO 1   0 (0000000000000000000)   TLBP 001000

Format: TLBP    MIPS I

Description:
The Index register is loaded with the address of the TLB entry whose contents match the contents of the EntryHi register. If no TLB entry matches, the most significant bit of the Index register is set.
If two or more TLB entries that match the contents of the EntryHi register are found, the TS bit of the Status register is set to 1, and a TLB refill exception occurs.
The operation is undefined if the instruction executed immediately after the TLBP instruction performs an operation related to memory referencing.
This operation is defined in kernel mode or when CP0 is enabled. Execution of this instruction in user/supervisor mode or when CP0 is not enabled causes a coprocessor unusable exception.

Operation:
  32  T: Index <- 1 || 0^25 || undefined^6
         for i in 0..TLBEntries-1
           if ((TLB[i]_95..77 and not TLB[i]_120..109) =
               (EntryHi_31..12 and not TLB[i]_120..109)) and
              (TLB[i]_76 or (TLB[i]_71..64 = EntryHi_7..0)) then
             Index <- 0^26 || i_5..0
           endif
         endfor
  64  T: Index <- 1 || 0^25 || undefined^6
         for i in 0..TLBEntries-1
           if (TLB[i]_171..141 and not (0^15 || TLB[i]_216..205)) =
              (EntryHi_43..13 and not (0^15 || TLB[i]_216..205)) and
              (TLB[i]_140 or (TLB[i]_135..128 = EntryHi_7..0)) then
             Index <- 0^26 || i_5..0
           endif
         endfor

Exceptions:
  Coprocessor unusable exception
  TLB refill exception
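The probe loop in the TLBP operation can be pictured with a simplified software model. The Python sketch below keeps only the three ingredients of the match condition (virtual page number compared under the page mask, the global bit, and the ASID); the TlbEntry class, its field names, and the choice of 0 for the bits the manual leaves undefined on a miss are all assumptions for illustration. The multiple-match case that sets the TS bit is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TlbEntry:
    vpn2: int        # virtual page number field of the entry
    page_mask: int   # bits set here are excluded from the comparison
    is_global: bool  # G bit: when set, the ASID is ignored
    asid: int        # address space identifier


P_BIT = 1 << 31  # most significant bit of Index, set when the probe fails


def tlbp(tlb: List[TlbEntry], entryhi_vpn2: int, entryhi_asid: int) -> int:
    """Return the value loaded into the Index register by a probe."""
    index = P_BIT  # low bits are undefined in the manual; 0 is used here
    for i, entry in enumerate(tlb):
        keep = ~entry.page_mask
        vpn_match = (entry.vpn2 & keep) == (entryhi_vpn2 & keep)
        asid_match = entry.is_global or entry.asid == entryhi_asid
        if vpn_match and asid_match:
            index = i & 0x3F
    return index
```

A caller would typically test the most significant bit of the returned value to decide whether a matching entry exists before using the low 6 bits as an entry number.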
TLBR  Read Indexed TLB Entry

  Bits    31..26        25     24..6                     5..0
  Field   COP0 010000   CO 1   0 (0000000000000000000)   TLBR 000001

Format: TLBR    MIPS I

Description:
The EntryHi and EntryLo registers are loaded with the contents of the TLB entry pointed at by the contents of the TLB Index register.
The G bit (which controls ASID matching) read from the TLB is written to both of the EntryLo0 and EntryLo1 registers. The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and EntryLo1 registers.
The operation is invalid if the contents of the TLB Index register are greater than the number of TLB entries in the processor.
This operation is defined in kernel mode or when CP0 is enabled. Execution of this instruction in user/supervisor mode or when CP0 is not enabled causes a coprocessor unusable exception.

Operation:
  32  T: PageMask <- TLB[Index_5..0]_127..96
         EntryHi <- TLB[Index_5..0]_95..64 and not TLB[Index_5..0]_127..96
         EntryLo1 <- TLB[Index_5..0]_63..32
         EntryLo0 <- TLB[Index_5..0]_31..0
  64  T: PageMask <- TLB[Index_5..0]_255..192
         EntryHi <- TLB[Index_5..0]_191..128 and not TLB[Index_5..0]_255..192
         EntryLo1 <- TLB[Index_5..0]_127..65 || TLB[Index_5..0]_140
         EntryLo0 <- TLB[Index_5..0]_63..1 || TLB[Index_5..0]_140

Exceptions:
  Coprocessor unusable exception
TLBWI  Write Indexed TLB Entry

  Bits    31..26        25     24..6                     5..0
  Field   COP0 010000   CO 1   0 (0000000000000000000)   TLBWI 000010

Format: TLBWI    MIPS I

Description:
The TLB entry pointed at by the contents of the TLB Index register is loaded with the contents of the EntryHi and EntryLo registers.
The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and EntryLo1 registers.
The operation is invalid if the contents of the TLB Index register are greater than the number of TLB entries in the processor.

Operation:
  32, 64  T: TLB[Index_5..0] <- PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0

Exceptions:
  Coprocessor unusable exception
TLBWR  Write Random TLB Entry

  Bits    31..26        25     24..6                     5..0
  Field   COP0 010000   CO 1   0 (0000000000000000000)   TLBWR 000110

Format: TLBWR    MIPS I

Description:
The TLB entry pointed at by the contents of the TLB Random register is loaded with the contents of the EntryHi and EntryLo registers.
The G bit of the TLB is written with the logical AND of the G bits in the EntryLo0 and EntryLo1 registers.

Operation:
  32, 64  T: TLB[Random_5..0] <- PageMask || (EntryHi and not PageMask) || EntryLo1 || EntryLo0

Exceptions:
  Coprocessor unusable exception
TLT  Trap If Less Than

  Bits    31..26           25..21  20..16  15..6   5..0
  Field   SPECIAL 000000   rs      rt      code    TLT 110010

Format: TLT rs, rt    MIPS II

Purpose: Compares general-purpose registers and executes a conditional trap.

Description:
The contents of general-purpose register rt are compared to the contents of general-purpose register rs. Considering both quantities as signed integers, if the contents of general-purpose register rs are less than the contents of general-purpose register rt, a trap exception occurs.
The code field is available for use as software parameters, but it can be retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T: if GPR[rs] < GPR[rt] then
               TrapException
             endif

Exceptions:
  Trap exception
TLTI  Trap If Less Than Immediate

  Bits    31..26          25..21  20..16       15..0
  Field   REGIMM 000001   rs      TLTI 01010   immediate

Format: TLTI rs, immediate    MIPS II

Purpose: Compares a general-purpose register and a constant and executes a conditional trap.

Description:
The 16-bit immediate is sign-extended and compared to the contents of general-purpose register rs. Considering both quantities as signed integers, if the contents of general-purpose register rs are less than the sign-extended immediate, a trap exception occurs.

Operation:
  32  T: if GPR[rs] < (immediate_15)^16 || immediate_15..0 then
           TrapException
         endif
  64  T: if GPR[rs] < (immediate_15)^48 || immediate_15..0 then
           TrapException
         endif

Exceptions:
  Trap exception
TLTIU  Trap If Less Than Immediate Unsigned

  Bits    31..26          25..21  20..16        15..0
  Field   REGIMM 000001   rs      TLTIU 01011   immediate

Format: TLTIU rs, immediate    MIPS II

Purpose: Compares a general-purpose register and a constant and executes a conditional trap.

Description:
The 16-bit immediate is sign-extended and compared to the contents of general-purpose register rs. Considering both quantities as unsigned integers, if the contents of general-purpose register rs are less than the sign-extended immediate, a trap exception occurs.

Operation:
  32  T: if (0 || GPR[rs]) < (0 || (immediate_15)^16 || immediate_15..0) then
           TrapException
         endif
  64  T: if (0 || GPR[rs]) < (0 || (immediate_15)^48 || immediate_15..0) then
           TrapException
         endif

Exceptions:
  Trap exception
TLTU  Trap If Less Than Unsigned

  Bits    31..26           25..21  20..16  15..6   5..0
  Field   SPECIAL 000000   rs      rt      code    TLTU 110011

Format: TLTU rs, rt    MIPS II

Purpose: Compares general-purpose registers and executes a conditional trap.

Description:
The contents of general-purpose register rt are compared to the contents of general-purpose register rs. Considering both quantities as unsigned integers, if the contents of general-purpose register rs are less than the contents of general-purpose register rt, a trap exception occurs.
The code field is available for use as software parameters, but it can be retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T: if (0 || GPR[rs]) < (0 || GPR[rt]) then
               TrapException
             endif

Exceptions:
  Trap exception
TNE  Trap If Not Equal

  Bits    31..26           25..21  20..16  15..6   5..0
  Field   SPECIAL 000000   rs      rt      code    TNE 110110

Format: TNE rs, rt    MIPS II

Purpose: Compares general-purpose registers and executes a conditional trap.

Description:
The contents of general-purpose register rt are compared to the contents of general-purpose register rs. If the contents of general-purpose register rs are not equal to the contents of general-purpose register rt, a trap exception occurs.
The code field is available for use as software parameters, but it can be retrieved by the exception handler only by loading the contents of the memory word containing the instruction.

Operation:
  32, 64  T: if GPR[rs] != GPR[rt] then
               TrapException
             endif

Exceptions:
  Trap exception
TNEI  Trap If Not Equal Immediate

  Bits    31..26          25..21  20..16       15..0
  Field   REGIMM 000001   rs      TNEI 01110   immediate

Format: TNEI rs, immediate    MIPS II

Purpose: Compares a general-purpose register and a constant and executes a conditional trap.

Description:
The 16-bit immediate is sign-extended and compared to the contents of general-purpose register rs. If the contents of general-purpose register rs are not equal to the sign-extended immediate, a trap exception occurs.

Operation:
  32  T: if GPR[rs] != (immediate_15)^16 || immediate_15..0 then
           TrapException
         endif
  64  T: if GPR[rs] != (immediate_15)^48 || immediate_15..0 then
           TrapException
         endif

Exceptions:
  Trap exception
WAIT  Wait

  Bits    31..26        25     24..6                                  5..0
  Field   COP0 010000   CO 1   implementation-dependent information   WAIT 100000

Format: WAIT    VR5500

Purpose: Sets the CPU in the standby mode.

Description:
This instruction places the processor in the standby mode. The processor is kept waiting by this instruction until all the instructions executed before it pass through the commit stage. It stops the operation of the pipeline after all the system interface requests, instruction fetches, and writebacks to memory have been completed. If bits 10 to 6 of the instruction code are all cleared to 0, the processor also stops the clock supply. If these bits are not all cleared, the clock continues to be supplied.
To release the processor from the standby mode, use a reset, an NMI request, or any of the enabled interrupts. When the processor has been released from the standby mode, an exception occurs, and the address of the instruction next to the WAIT instruction is stored in the EPC/ErrorEPC register.
The operation of the processor is undefined if this instruction is in a branch delay slot. The operation is also undefined if this instruction is executed when the EXL and ERL bits of the Status register are set to 1.
This operation is defined in kernel mode or when CP0 is enabled. Execution of this instruction in user/supervisor mode or when CP0 is not enabled causes a coprocessor unusable exception.

Operation:
  32, 64  T: Standby operation ()
             if implementation-dependent information_4..0 = 0 then
               pipeline clock stop
             else
               pipeline clock not stop
             endif

Exceptions:
  Coprocessor unusable exception
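The WAIT encoding and the clock-stop condition can be expressed compactly. The Python sketch below assembles the instruction word from the fields shown in the bit layout and tests bits 10 to 6, which are the low 5 bits of the implementation-dependent information field; the function names are assumptions introduced for this example.

```python
COP0_OPCODE = 0b010000
CO_BIT = 1 << 25
WAIT_FUNCTION = 0b100000


def encode_wait(info: int = 0) -> int:
    """Build a WAIT instruction word; info fills bits 24..6 (19 bits)."""
    if not 0 <= info < (1 << 19):
        raise ValueError("info must fit in 19 bits")
    return (COP0_OPCODE << 26) | CO_BIT | (info << 6) | WAIT_FUNCTION


def wait_stops_clock(instruction_word: int) -> bool:
    """True when bits 10..6 are all 0, the case in which the clock is stopped."""
    return ((instruction_word >> 6) & 0x1F) == 0
```

With all information bits cleared, the word is 0x42000020 and the clock-stop condition holds.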
XOR  Exclusive OR

  Bits    31..26           25..21  20..16  15..11  10..6     5..0
  Field   SPECIAL 000000   rs      rt      rd      0 00000   XOR 100110

Format: XOR rd, rs, rt    MIPS I

Purpose: Performs a bit-wise logical XOR operation.

Description:
The contents of general-purpose register rs are combined with the contents of general-purpose register rt in a bit-wise logical exclusive OR operation. The result is stored in general-purpose register rd.

Operation:
  32, 64  T: GPR[rd] <- GPR[rs] xor GPR[rt]

Exceptions:
  None
XORI  Exclusive OR Immediate

  Bits    31..26        25..21  20..16  15..0
  Field   XORI 001110   rs      rt      immediate

Format: XORI rt, rs, immediate    MIPS I

Purpose: Performs a bit-wise logical XOR operation with a constant.

Description:
The 16-bit immediate is zero-extended and combined with the contents of general-purpose register rs in a bit-wise logical exclusive OR operation. The result is stored in general-purpose register rt.

Operation:
  32  T: GPR[rt] <- GPR[rs] xor (0^16 || immediate)
  64  T: GPR[rt] <- GPR[rs] xor (0^48 || immediate)

Exceptions:
  None
17.4 CPU Instruction Opcode Bit Encoding

Figure 17-1 lists the VR5500 opcode (ISA and extended ISA) encoding.

Figure 17-1. CPU Instruction Opcode Bit Encoding (1/2)

Opcode (rows: bits 31...29, columns: bits 28...26)

       0          1        2      3       4          5      6       7
  0    SPECIAL    REGIMM   J      JAL     BEQ        BNE    BLEZ    BGTZ
  1    ADDI       ADDIU    SLTI   SLTIU   ANDI       ORI    XORI    LUI
  2    COP0       COP1     COP2   COP1X   BEQL       BNEL   BLEZL   BGTZL
  3    DADDI      DADDIU   LDL    LDR     SPECIAL2   *      *       *
  4    LB         LH       LWL    LW      LBU        LHU    LWR     LWU
  5    SB         SH       SWL    SW      SDL        SDR    SWR     CACHE
  6    LL         LWC1     *      PREF    LLD        LDC1   *       LD
  7    SC         SWC1     *      *       SCD        SDC1   *       SD

SPECIAL function (rows: bits 5...3, columns: bits 2...0)

       0           1       2      3       4         5        6        7
  0    SLL/SSNOP   *       SRL    SRA     SLLV      *        SRLV     SRAV
  1    JR          JALR    MOVZ   MOVN    SYSCALL   BREAK    *        SYNC
  2    MFHI        MTHI    MFLO   MTLO    DSLLV     *        DSRLV    DSRAV
  3    MULT        MULTU   DIV    DIVU    DMULT     DMULTU   DDIV     DDIVU
  4    ADD         ADDU    SUB    SUBU    AND       OR       XOR      NOR
  5    *           *       SLT    SLTU    DADD      DADDU    DSUB     DSUBU
  6    TGE         TGEU    TLT    TLTU    TEQ       *        TNE      *
  7    DSLL        *       DSRL   DSRA    DSLL32    *        DSRL32   DSRA32

REGIMM rt (rows: bits 20...19, columns: bits 18...16)

       0        1        2         3         4      5   6      7
  0    BLTZ     BGEZ     BLTZL     BGEZL     *      *   *      *
  1    TGEI     TGEIU    TLTI      TLTIU     TEQI   *   TNEI   *
  2    BLTZAL   BGEZAL   BLTZALL   BGEZALL   *      *   *      *
  3    *        *        *         *         *      *   *      *
chapter 17 cpu instruction set preliminary user?s manual u16044ej1v0um 524 figure 17-1. cpu instruction opcode bit encoding (2/2) 23...21 copz rs 25, 2401234567 0mf dmf cf mt dmt ct 1bc ?????? 2co 3 18...16 copz rt 20...19 01234567 0 bcf bct bcfl bctl ??? 1 ??????? 2 ??????? 3 ??????? 2...0 cp0 function 5...3 01234567 0 tlbr tlbwi ?? tlbwr 1tlbp ?????? 2 ??????? 3eret ?????? 4 wait ?????? 5 ??????? 6 ??????? 7 ??????? 2...0 special2 function 5...3 01234567 0 madd maddu mul64 msub msubu ? 1 ??????? 2 ??????? 3 ??????? 4clzclo ? dclz dclo ? 5 ??????? 6 ??????? 7 ???????
Remark  The meanings of the symbols in the above figures are as follows.

  *: Operation codes marked with an asterisk cause reserved instruction exceptions in current VR5500 implementations and are reserved for future versions of the architecture.
  γ: Operation codes marked with a gamma cause a reserved instruction exception. They are reserved for future versions of the architecture.
  δ: Operation codes marked with a delta are valid only for processors in which CP0 is enabled, and cause a reserved instruction exception in other processors.
  χ: Operation codes marked with a chi are valid only in the VR4000 and VR5000 Series.
  ε: Operation codes marked with an epsilon are valid when the processor operates in 64-bit mode or 32-bit kernel mode. These instructions cause a reserved instruction exception when the processor operates in 32-bit user/supervisor mode.
  π: Operation codes marked with a pi are also used in instructions that were added to the VR5500, such as the sum-of-products operation and rotate instructions.
  ξ: Operation codes marked with a xi are valid only in the VR5500.
  ρ: Operation codes marked with a rho are valid only for operation in kernel mode or for processors in which CP0 is enabled. These instructions cause a coprocessor unusable exception when the processor operates in 32-bit user/supervisor mode or in processors in which CP0 is disabled.
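The major opcode table in Figure 17-1 is indexed by bits 31...29 (row) and bits 28...26 (column), which makes a lookup straightforward. The Python sketch below transcribes the opcode table as it appears in the figure and decodes the opcode field of an instruction word; None stands for the cells marked with an asterisk, and the function name is an assumption for this example. The symbol annotations described in the remark are not represented.

```python
OPCODE_TABLE = [
    ["SPECIAL", "REGIMM", "J", "JAL", "BEQ", "BNE", "BLEZ", "BGTZ"],
    ["ADDI", "ADDIU", "SLTI", "SLTIU", "ANDI", "ORI", "XORI", "LUI"],
    ["COP0", "COP1", "COP2", "COP1X", "BEQL", "BNEL", "BLEZL", "BGTZL"],
    ["DADDI", "DADDIU", "LDL", "LDR", "SPECIAL2", None, None, None],
    ["LB", "LH", "LWL", "LW", "LBU", "LHU", "LWR", "LWU"],
    ["SB", "SH", "SWL", "SW", "SDL", "SDR", "SWR", "CACHE"],
    ["LL", "LWC1", None, "PREF", "LLD", "LDC1", None, "LD"],
    ["SC", "SWC1", None, None, "SCD", "SDC1", None, "SD"],
]


def major_opcode(instruction_word: int):
    """Look up the mnemonic for bits 31..26; None means an asterisk cell."""
    opcode = (instruction_word >> 26) & 0x3F
    row, column = opcode >> 3, opcode & 0x7
    return OPCODE_TABLE[row][column]
```

For instance, the opcode 101011 given for SW falls in row 5, column 3 of the table.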
CHAPTER 18 FPU INSTRUCTION SET

This chapter outlines the floating-point instructions (FPU instructions) and explains the function of each instruction.

18.1 Type of Instruction

The FPU instructions are classified into the following three basic types.

  • I type (immediate type) instructions, such as load and store instructions
  • R type (register type) instructions, such as floating-point operation instructions using two or three registers
  • Other instructions, such as branch and transfer instructions

The floating-point instructions are mapped to the MIPS coprocessor instructions. The MIPS architecture defines coprocessor 1 (CP1) as a floating-point unit.
The instruction types used for the load/store instructions are shown in Figure 18-1.

Figure 18-1. Load/Store Instruction Format

I type (immediate)

  Bits    31..26  25..21  20..16  15..0
  Field   op      base    ft      offset
  Width   6       5       5       16

R type (register)

  Bits    31..26  25..21  20..16  15..11  10..6  5..0
  Field   COP1X   base    index   0       fd     function
  Width   6       5       5       5       5      6

  Bits    31..26  25..21  20..16  15..11  10..6  5..0
  Field   COP1X   base    index   fs      0      function
  Width   6       5       5       5       5      6

  op, COP1X   6-bit opcode
  base        5-bit base register specifier
  index       5-bit index register specifier
  ft          5-bit source (for store) or destination (for load) FPU register specifier
  fs          5-bit source FPU register specifier
  fd          5-bit destination FPU register specifier
  offset      16-bit signed immediate offset
  function    6-bit function field

The R type load/store instructions (register + register addressing mode) were added in the MIPS IV instruction set.
All the load/store instructions of the coprocessor reference data aligned at its natural boundary. Therefore, the access type of a word load/store instruction is always word, and the lower 2 bits of the address are always 0. The access type of a doubleword load/store instruction is always doubleword, and the lower 3 bits of the address are always 0.
The byte in the accessed field that has the lowest byte address is specified as the address regardless of the byte order (endian). In a big-endian system, this byte is the leftmost byte. In a little-endian system, it is the rightmost byte.
Figure 18-2 shows the instruction format of R type instructions used for operation instructions.

Figure 18-2. Operation Instruction Format

R type (register)

  Bits    31..26  25..21  20..16  15..11  10..6  5..0
  Field   COP1    fmt     ft      fs      fd     function
  Width   6       5       5       5       5      6

  Bits    31..26  25..21  20..16  15..11  10..6  5..3       2..0
  Field   COP1X   fr      ft      fs      fd     function   fmt
  Width   6       5       5       5       5      3          3

  COP1, COP1X   6-bit opcode
  fmt           5-bit or 3-bit format specifier
  fs            5-bit source 1 register
  ft            5-bit source 2 register
  fr            5-bit source 3 register
  fd            5-bit destination register
  function      6-bit or 3-bit function field

Many formats can be applied to the floating-point instructions. The operand format of an instruction is specified by a 5-bit or 3-bit fmt field. The codes of this field are shown in Table 18-1.

Table 18-1. Format Field Code

  fmt(4:0)   fmt(2:0)   Mnemonic   Size                         Format
  0 to 15    -          Reserved
  16         0          S          Single precision (32 bits)   Binary floating point
  17         1          D          Double precision (64 bits)   Binary floating point
  18         2          Reserved
  19         3          Reserved
  20         4          W          32 bits                      Binary fixed point
  21         5          L          64 bits                      Binary fixed point
  22 to 31   6, 7       Reserved

The function field indicates the floating-point operation to be executed.
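Table 18-1 amounts to a small lookup from the fmt field to a format mnemonic. The Python sketch below encodes the table for both the 5-bit and the 3-bit form of the field and extracts the 5-bit fmt field (bits 25..21) from a COP1 operation word; the names are assumptions made for illustration, and the COP1 opcode value 010001 is not taken from this chapter but follows from the opcode table in Figure 17-1 (row 2, column 1).

```python
FMT_5BIT = {16: "S", 17: "D", 20: "W", 21: "L"}  # fmt(4:0) column
FMT_3BIT = {0: "S", 1: "D", 4: "W", 5: "L"}      # fmt(2:0) column

COP1_OPCODE = 0b010001  # row 2, column 1 of the opcode table


def decode_fmt(value: int, bits: int = 5) -> str:
    """Return the format mnemonic for a fmt field value, or 'reserved'."""
    if bits == 5:
        return FMT_5BIT.get(value & 0x1F, "reserved")
    if bits == 3:
        return FMT_3BIT.get(value & 0x7, "reserved")
    raise ValueError("fmt is either a 5-bit or a 3-bit field")


def cop1_fmt(instruction_word: int) -> str:
    """Decode the fmt field (bits 25..21) of a COP1 operation instruction."""
    if (instruction_word >> 26) & 0x3F != COP1_OPCODE:
        raise ValueError("not a COP1 instruction word")
    return decode_fmt((instruction_word >> 21) & 0x1F)
```

For the defined formats, the 3-bit code equals the 5-bit code minus 16, which is why the two columns of the table line up.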
18.1.1 data format

each operation is valid only in specific data formats. some formats and operations may be supported by emulation. however, valid combinations (those marked "v" in table 18-2) must be supported. combinations marked "r" in table 18-2 are not defined by this architecture at present and cause an unimplemented operation exception. these combinations are reserved for future expansion of the architecture.

table 18-2. valid format of fpu instruction

                source format
  operation   single   double   word   long word
  add         v        v        r      r
  sub         v        v        r      r
  mul         v        v        r      r
  div         v        v        r      r
  sqrt        v        v        r      r
  abs         v        v        r      r
  mov         v        v
  neg         v        v        r      r
  trunc.l     v        v
  round.l     v        v
  ceil.l      v        v
  floor.l     v        v
  trunc.w     v        v
  round.w     v        v
  ceil.w      v        v
  floor.w     v        v
  cvt.s                v        v      v
  cvt.d       v                 v      v
  cvt.w       v        v
  cvt.l       v        v
  c           v        v        r      r

remark  v: valid
        r: reserved
18.2 instruction notation conventions

in this chapter, all variable subfields in an instruction format (such as fs, ft, immediate, etc.) are shown in lower-case names. the instruction names (e.g. add and sub) are indicated by upper-case characters.

for the sake of clarity, we sometimes use an alias for a variable subfield in the formats of specific instructions. for example, we use base instead of fs in the format for load and store instructions. such an alias is always lower case, since it refers to a variable subfield.

the two subfields op and function of some instructions are 6-bit fixed values. these subfields are indicated by upper-case mnemonics. for example, the floating-point add instruction uses op = cop1 and function = add. in other cases, a single field contains both a constant part and a variable part, so both upper-case and lower-case characters are used.

the architecture level at which the instruction was first defined is indicated on the right of the instruction format. the product name is also shown for instructions that may be implemented differently depending on the product.

figures with the actual bit encoding for all the mnemonics and the function field are located at the end of this chapter (18.5 fpu instruction opcode bit encoding), and the bit encoding also accompanies each instruction.

in the instruction descriptions that follow, the operation section describes the operation performed by each instruction using a high-level language notation. special symbols used in the notation are described in table 17-1 cpu instruction operation notations.

the following examples illustrate the application of some of the instruction notation conventions.

example 1:
  gpr[rt] <- immediate || 0^16
  sixteen zero bits are concatenated with an immediate value (typically 16 bits), and the 32-bit string is assigned to general-purpose register rt.
example 2:
  (immediate_15)^16 || immediate_15...0
  bit 15 (the sign bit) of an immediate value is extended for 16 bit positions, and the result is concatenated with bits 15 to 0 of the immediate value to form a 32-bit sign-extended value.

example 3:
  cpr[1, ft] <- data
  assign data to general-purpose register ft of cp1, i.e., floating-point general-purpose register fgr.

the terms fgr and fpr are used in the explanation of each instruction. fgr means the 32 fpu floating-point general-purpose registers fgr0 to fgr31, and fpr means the floating-point registers of the fpu. the load/store instructions, and instructions that transfer data with the cpu, use fgrs (may be described as cpr in some cases). the transfer instructions, operation instructions, and conversion instructions in cp1 use the fpr.

• when the fr bit (bit 26) of the status register is 0, only even fprs are valid, and all the 32 fgrs are 32 bits wide.
• when the fr bit (bit 26) of the status register is 1, both odd and even fprs are valid, and all the 32 fgrs are 64 bits wide.
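the bit-string notation of examples 1 and 2 (concatenation with ||, and replication of a bit written as a superscript count) maps directly onto integer shifts and masks. the following sketch uses hypothetical helper names and is given only to make the notation concrete.

```python
def concat_zero16(immediate):
    """Example 1: immediate || 0^16, a 32-bit string with 16 low zero bits."""
    return ((immediate & 0xFFFF) << 16) & 0xFFFFFFFF


def sign_extend_16_to_32(immediate):
    """Example 2: (immediate_15)^16 || immediate_15...0."""
    low = immediate & 0xFFFF
    sign = (low >> 15) & 1
    upper = 0xFFFF if sign else 0x0000  # bit 15 replicated 16 times
    return (upper << 16) | low
```

with an immediate of 0x8001, example 2 yields 0xffff8001; with 0x7fff the upper half stays zero.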
to get an fpr value, or to change the value of an fgr, the following routine is used in the description of a floating-point operation.

(1) 32-bit mode

  value <- valuefpr (fpr, fmt)      /* undefined for odd fpr */
      case fmt of
          s, w: value <- fgr[fpr+0]
          d:    value <- fgr[fpr+1] || fgr[fpr+0]
      end

  storefpr (fpr, fmt, value):       /* undefined for odd fpr */
      case fmt of
          s, w: fgr[fpr+1] <- undefined
                fgr[fpr+0] <- value
          d:    fgr[fpr+1] <- value_63...32
                fgr[fpr+0] <- value_31...0
      end

(2) 64-bit mode

  value <- valuefpr (fpr, fmt)
      case fmt of
          s, w: value <- fgr[fpr]_31...0
          d, l: value <- fgr[fpr]
      end

  storefpr (fpr, fmt, value):
      case fmt of
          s, w: fgr[fpr] <- undefined^32 || value
          d, l: fgr[fpr] <- value
      end
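the valuefpr and storefpr routines can be modelled as a small register-file class. this is a sketch only: the class and method names are hypothetical, values are raw bit patterns held as integers, and bits that the routine leaves "undefined" are written as zero here, which is a choice of this sketch and not a statement about the hardware.

```python
class FpuRegisterFile:
    """Model of the 32 fgrs; fr mirrors the fr bit of the status register."""

    def __init__(self, fr):
        self.fr = fr          # 0: fgrs are 32 bits wide, 1: 64 bits wide
        self.fgr = [0] * 32

    def value_fpr(self, fpr, fmt):
        if self.fr == 0:
            if fpr % 2:
                raise ValueError("odd fpr is undefined when fr = 0")
            if fmt in ("s", "w"):
                return self.fgr[fpr]
            if fmt == "d":
                return (self.fgr[fpr + 1] << 32) | self.fgr[fpr]
        else:
            if fmt in ("s", "w"):
                return self.fgr[fpr] & 0xFFFFFFFF
            if fmt in ("d", "l"):
                return self.fgr[fpr]
        raise ValueError("format not covered by the routine")

    def store_fpr(self, fpr, fmt, value):
        if self.fr == 0:
            if fpr % 2:
                raise ValueError("odd fpr is undefined when fr = 0")
            if fmt in ("s", "w"):
                self.fgr[fpr + 1] = 0  # "undefined" in the routine; zero here
                self.fgr[fpr] = value & 0xFFFFFFFF
            elif fmt == "d":
                self.fgr[fpr + 1] = (value >> 32) & 0xFFFFFFFF
                self.fgr[fpr] = value & 0xFFFFFFFF
            else:
                raise ValueError("format not covered by the routine")
        else:
            if fmt in ("s", "w"):
                # upper 32 bits are "undefined" in the routine; zero here
                self.fgr[fpr] = value & 0xFFFFFFFF
            elif fmt in ("d", "l"):
                self.fgr[fpr] = value & 0xFFFFFFFFFFFFFFFF
            else:
                raise ValueError("format not covered by the routine")
```

in 32-bit mode a double stored to fpr 2 is split across fgr2 (lower half) and fgr3 (upper half); in 64-bit mode the same value occupies a single fgr.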
18.3 cautions on using fpu instructions

18.3.1 load and store instructions

all data transfers between the floating-point unit (fpu) and memory are executed by coprocessor load/store instructions. these instructions reference the general-purpose registers of the fpu. these instructions do not convert formats as they are independent of data formats. therefore, a floating-point exception does not occur even if these instructions are executed.

data can be directly transferred between the fpu and processor by using the mtc or mfc instruction. like the floating-point load/store instructions, these instructions do not convert formats; therefore, a floating-point exception does not occur.

five floating-point control registers can be used as the registers of the fpu. only the ctc1 and cfc1 instructions are supported for these registers.

an instruction immediately after the load instruction can reference the contents of the register that has been loaded, but execution of that instruction may be delayed. although the vr5500 can cover the load delay with an out-of-order mechanism, scheduling the load delay slot is recommended to improve the performance.

the operation of the load/store instruction differs depending on the bit width of the floating-point general-purpose register (fgr), as follows.

• when the fr bit of the status register is 0
  the fgr is 32 bits wide. the sixteen even registers of the 32 fgrs can be accessed to hold single-precision floating-point data. to hold double-precision floating-point data, sixteen data items can be held by using an even register to hold the lower bits of the data and an odd register to hold the higher bits.
• when the fr bit of the status register is 1
  the fgr is 64 bits wide. the lower bits of the 32 fgrs are accessed to hold single-precision floating-point data. to hold double-precision floating-point data, the 32 fgrs are accessed.
in the load and store descriptions, the functions listed below are used to summarize the handling of virtual addresses and physical memory.

table 18-3. load and store common functions

  function             meaning
  addresstranslation   uses the tlb to find the physical address given the virtual address. the function fails and a tlb refill exception occurs if the required translation is not present in the tlb.
  loadmemory           searches the specified data length (doubleword, word) containing the specified physical address in the cache and main memory and loads the contents. if the cache is enabled for this access, the contents are loaded to the cache.
  storememory          searches the contents of the specified data length (doubleword, word) in the cache, write buffer, and main memory and stores the contents in the specified physical address.
18.3.2 floating-point operation instructions

the operation instructions include all the floating-point operations executed by the fpu. the instruction set of the fpu includes the following instructions.

• floating-point addition
• floating-point subtraction
• floating-point multiplication
• floating-point division
• floating-point square root
• floating-point reciprocal
• reciprocal of floating-point square root
• conversion between fixed-point and floating-point formats
• conversion between floating-point formats
• floating-point comparison

these instructions conform to ieee standard 754 to ensure accuracy. the result of an operation is the same as the infinitely precise result rounded to the specified format by using the current rounding mode.

the operand format must be specified for an instruction. except for the conversion instructions, an instruction cannot operate on operands of different formats.
18.3.3 fpu branch instruction

the fpu branch instruction can be used with the logic of its conditions inverted. therefore, only 16 comparisons are necessary for all 32 conditions, as shown in table 18-4. the 4-bit condition code of a floating-point comparison instruction specifies a condition in the "true" column of this table. to invert the logic of the condition for the fpu branch instruction, the condition in the "false" column of this table is applied.

if a not-a-number (nan) is specified as an operand, the result of the comparison with any other value is "unordered" because no greater-than or less-than relationship can be established.

table 18-4. logical inversion of term depending on true/false of condition

  condition                relationship                                     invalid operation exception
  mnemonic                                                                  in case of unordered
  true    false   code     greater than   less than   equal to   unordered
  f       t       0        f              f           f          f          does not occur
  un      or      1        f              f           f          t          does not occur
  eq      neq     2        f              f           t          f          does not occur
  ueq     ogl     3        f              f           t          t          does not occur
  olt     uge     4        f              t           f          f          does not occur
  ult     oge     5        f              t           f          t          does not occur
  ole     ugt     6        f              t           t          f          does not occur
  ule     ogt     7        f              t           t          t          does not occur
  sf      st      8        f              f           f          f          occurs
  ngle    gle     9        f              f           f          t          occurs
  seq     sne     10       f              f           t          f          occurs
  ngl     gl      11       f              f           t          t          occurs
  lt      nlt     12       f              t           f          f          occurs
  nge     ge      13       f              t           f          t          occurs
  le      nle     14       f              t           t          f          occurs
  ngt     gt      15       f              t           t          t          occurs

remark  f: false
        t: true

18.4 fpu instruction

this section describes the functions of fpu instructions in detail in alphabetical order. the exceptions that may occur by executing each instruction are shown at the end of each instruction's description. for details of exceptions and their processes, see chapter 8 floating-point exceptions.
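the relationship columns of table 18-4 follow the individual bits of the 4-bit condition code: bit 2 selects "less than", bit 1 "equal to", bit 0 "unordered", and bit 3 selects whether an unordered result raises the invalid operation exception. the sketch below regenerates a table row from the code; the list and function names are hypothetical.

```python
TRUE_MNEMONICS = ["f", "un", "eq", "ueq", "olt", "ult", "ole", "ule",
                  "sf", "ngle", "seq", "ngl", "lt", "nge", "le", "ngt"]
FALSE_MNEMONICS = ["t", "or", "neq", "ogl", "uge", "oge", "ugt", "ogt",
                   "st", "gle", "sne", "gl", "nlt", "ge", "nle", "gt"]


def table_row(code):
    """Rebuild one row of table 18-4 from the 4-bit condition code."""
    return {
        "true": TRUE_MNEMONICS[code],
        "false": FALSE_MNEMONICS[code],
        "greater_than": False,              # never selected by a "true" condition
        "less_than": bool(code & 4),
        "equal_to": bool(code & 2),
        "unordered": bool(code & 1),
        "invalid_if_unordered": bool(code & 8),
    }
```

for instance, code 5 gives the pair ult/oge with "less than" and "unordered" selected and no invalid operation exception on an unordered result.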
abs.fmt  floating-point absolute value

  bits:   31-26   25-21  20-16  15-11  10-6  5-0
  field:  cop1    fmt    0      fs     fd    abs
  value:  010001         00000               000101

format:
  abs.s fd, fs        mips i
  abs.d fd, fs

purpose:
  calculates the absolute value of a floating-point value.

description:
  this instruction calculates the absolute value of the contents of floating-point register fs and stores the result in floating-point register fd. the operand is processed as floating-point format fmt. the absolute value is arithmetically calculated. if the operand is nan, therefore, an invalid operation exception occurs.
  this instruction is valid only in single-/double-precision floating-point formats.
  if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
  32, 64   t: storefpr (fd, fmt, absolutevalue (valuefpr (fs, fmt)))

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
  invalid operation exception
add.fmt  floating-point add

  bits:   31-26   25-21  20-16  15-11  10-6  5-0
  field:  cop1    fmt    ft     fs     fd    add
  value:  010001                             000000

format:
  add.s fd, fs, ft    mips i
  add.d fd, fs, ft

purpose:
  adds floating-point values.

description:
  this instruction adds the contents of floating-point register fs to the contents of floating-point register ft, and stores the result in floating-point register fd. the operands are processed as floating-point format fmt. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode.
  this instruction is valid only in single-/double-precision floating-point formats.
  if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
  32, 64   t: storefpr (fd, fmt, valuefpr (fs, fmt) + valuefpr (ft, fmt))

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
  invalid operation exception
  inexact operation exception
  overflow exception
  underflow exception
bc1f  branch on fpu false (coprocessor 1)

  bits:   31-26   25-21  20-18  17  16  15-0
  field:  cop1    bc     cc     nd  tf  offset
  value:  010001  01000         0   0

format:
  bc1f offset         mips i
  bc1f cc, offset     mips iv

purpose:
  tests the floating-point condition code and executes a pc relative condition branch.

description:
  a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is false (0), execution branches to the branch address with a delay of one instruction.
  the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
  nd specifies whether the instruction in the branch delay slot is discarded if the branch condition is not satisfied. tf specifies which is used as the branch condition, true or false. the values of nd and tf are fixed for each instruction.
  the mips i instruction set architecture provides only 1 bit of a floating-point condition code: the c bit in fcr31. therefore, the cc field of the mips i, ii, and iii instruction set architectures must be 0. the mips iv instruction set architecture has seven additional condition code bits. the floating-point comparison instruction and conditional branch instruction specify the condition code bits to be set or tested. both the assembler formats are valid with the mips iv instruction set architecture.

remark  the condition branch range of this instruction is ±128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.
operation:
  mips i, ii, iii
    32   t - 1: condition <- fpconditioncode(0) = 0
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif
    64   t - 1: condition <- fpconditioncode(0) = 0
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif
  mips iv
    32   t - 1: condition <- fpconditioncode(cc) = 0
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif
    64   t - 1: condition <- fpconditioncode(cc) = 0
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
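the target computation of bc1f, a 16-bit offset shifted left two bits and sign-extended, can be sketched as follows. the function names are hypothetical, and the eight condition code bits are passed as one abstract bit vector `cc_bits`, which is an assumption of this sketch and not the register layout of fcr31 or fcr25.

```python
def branch_target_offset(offset16, width=32):
    """target <- (offset_15)^(width - 18) || offset || 0^2, as a bit pattern."""
    off = offset16 & 0xFFFF
    if off & 0x8000:
        off -= 0x10000           # sign-extend the 16-bit offset
    return (off << 2) & ((1 << width) - 1)


def bc1f_next_pc(pc_delay_slot, offset16, cc_bits, cc=0, width=32):
    """Address executed after the delay slot of a bc1f.

    pc_delay_slot is the address of the instruction in the delay slot.
    """
    mask = (1 << width) - 1
    condition = ((cc_bits >> cc) & 1) == 0   # branch when the tested bit is false
    if condition:
        return (pc_delay_slot + branch_target_offset(offset16, width)) & mask
    return (pc_delay_slot + 4) & mask
```

the largest forward displacement is 0x7fff words (131068 bytes) and the largest backward displacement is 0x8000 words (131072 bytes), which is the 18-bit signed range mentioned in the remark on this instruction.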
bc1fl  branch on fpu false likely (coprocessor 1)

  bits:   31-26   25-21  20-18  17  16  15-0
  field:  cop1    bc     cc     nd  tf  offset
  value:  010001  01000         1   0

format:
  bc1fl offset        mips ii
  bc1fl cc, offset    mips iv

purpose:
  tests the floating-point condition code and executes a pc relative condition branch. executes a delay slot only when a given branch condition is satisfied.

description:
  a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is false (0), execution branches to the branch address with a delay of one instruction. if the conditional branch is not taken, the instruction in the branch delay slot is discarded.
  the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
  nd specifies whether the instruction in the branch delay slot is discarded if the branch condition is not satisfied. tf specifies which is used as the branch condition, true or false. the values of nd and tf are fixed for each instruction.
  the mips i instruction set architecture provides only 1 bit of a floating-point condition code: the c bit in fcr31. therefore, the cc field of the mips i, ii, and iii instruction set architectures must be 0. the mips iv instruction set architecture has seven additional condition code bits. the floating-point comparison instruction and conditional branch instruction specify the condition code bits to be set or tested. both the assembler formats are valid with the mips iv instruction set architecture.

remarks 1. the condition branch range of this instruction is ±128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.
        2. use this instruction only when it is expected with a high probability (98% or higher) that a given branch condition is satisfied. if the branch condition is not expected to be satisfied, or if the outcome of the branch is not known, use the bc1f instruction.
operation:
  mips ii, iii
    32   t - 1: condition <- fpconditioncode(0) = 0
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif
    64   t - 1: condition <- fpconditioncode(0) = 0
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif
  mips iv
    32   t - 1: condition <- fpconditioncode(cc) = 0
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif
    64   t - 1: condition <- fpconditioncode(cc) = 0
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
bc1t  branch on fpu true (coprocessor 1)

  bits:   31-26   25-21  20-18  17  16  15-0
  field:  cop1    bc     cc     nd  tf  offset
  value:  010001  01000         0   1

format:
  bc1t offset         mips i
  bc1t cc, offset     mips iv

purpose:
  tests the floating-point condition code and executes a pc relative condition branch.

description:
  a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is true (1), execution branches to the branch address with a delay of one instruction.
  the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
  nd specifies whether the instruction in the branch delay slot is discarded if the branch condition is not satisfied. tf specifies which is used as the branch condition, true or false. the values of nd and tf are fixed for each instruction.
  the mips i instruction set architecture provides only 1 bit of a floating-point condition code: the c bit in fcr31. therefore, the cc field of the mips i, ii, and iii instruction set architectures must be 0. the mips iv instruction set architecture has seven additional condition code bits. the floating-point comparison instruction and conditional branch instruction specify the condition code bits to be set or tested. both the assembler formats are valid with the mips iv instruction set architecture.

remark  the condition branch range of this instruction is ±128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.
operation:
  mips i, ii, iii
    32   t - 1: condition <- fpconditioncode(0) = 1
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif
    64   t - 1: condition <- fpconditioncode(0) = 1
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif
  mips iv
    32   t - 1: condition <- fpconditioncode(cc) = 1
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif
    64   t - 1: condition <- fpconditioncode(cc) = 1
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                endif

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
bc1tl  branch on fpu true likely (coprocessor 1)

  bits:   31-26   25-21  20-18  17  16  15-0
  field:  cop1    bc     cc     nd  tf  offset
  value:  010001  01000         1   1

format:
  bc1tl offset        mips ii
  bc1tl cc, offset    mips iv

purpose:
  tests the floating-point condition code and executes a pc relative condition branch. executes a delay slot only when a given branch condition is satisfied.

description:
  a branch target address is computed from the sum of the address of the instruction in the delay slot and the 16-bit offset, shifted left two bits and sign-extended. if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is true (1), execution branches to the branch address with a delay of one instruction. if the conditional branch is not taken, the instruction in the branch delay slot is discarded.
  the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
  nd specifies whether the instruction in the branch delay slot is discarded if the branch condition is not satisfied. tf specifies which is used as the branch condition, true or false. the values of nd and tf are fixed for each instruction.
  the mips i instruction set architecture provides only 1 bit of a floating-point condition code: the c bit in fcr31. therefore, the cc field of the mips i, ii, and iii instruction set architectures must be 0. the mips iv instruction set architecture has seven additional condition code bits. the floating-point comparison instruction and conditional branch instruction specify the condition code bits to be set or tested. both the assembler formats are valid with the mips iv instruction set architecture.

remarks 1. the condition branch range of this instruction is ±128 kb because an 18-bit signed offset is used. to branch to an address outside this range, use the j or jr instruction.
        2. use this instruction only when it is expected with a high probability (98% or higher) that a given branch condition is satisfied. if the branch condition is not expected to be satisfied, or if the outcome of the branch is not known, use the bc1t instruction.
operation:
  mips ii, iii
    32   t - 1: condition <- fpconditioncode(0) = 1
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif
    64   t - 1: condition <- fpconditioncode(0) = 1
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif
  mips iv
    32   t - 1: condition <- fpconditioncode(cc) = 1
         t:     target <- (offset_15)^14 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif
    64   t - 1: condition <- fpconditioncode(cc) = 1
         t:     target <- (offset_15)^46 || offset || 0^2
         t + 1: if condition then
                    pc <- pc + target
                else
                    nullifycurrentinstruction
                endif

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
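the four fpu branch instructions differ only in the nd and tf bits, so their behaviour, including nullification of the delay slot for the "likely" forms, can be summarized in one function. this is an illustrative sketch; the function name and the returned tuple are hypothetical, and `cc_value` stands for the single condition code bit that the instruction tests.

```python
def fpu_branch(pc_delay_slot, offset16, cc_value, tf, nd, width=32):
    """Return (next_pc, delay_slot_executed).

    tf = 1 branches when the condition code is true, tf = 0 when it is false.
    nd = 1 (the "likely" forms) discards the delay slot if the branch is not taken.
    """
    mask = (1 << width) - 1
    off = offset16 & 0xFFFF
    if off & 0x8000:
        off -= 0x10000                      # sign-extend the 16-bit offset
    taken = (cc_value & 1) == tf
    if taken:
        return (pc_delay_slot + (off << 2)) & mask, True
    # not taken: continue after the delay slot; nullify it when nd = 1
    return (pc_delay_slot + 4) & mask, nd == 0
```

bc1f corresponds to (tf, nd) = (0, 0), bc1fl to (0, 1), bc1t to (1, 0), and bc1tl to (1, 1).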
c.cond.fmt  floating-point compare

  bits:   31-26   25-21  20-16  15-11  10-8  7-6  5-4        3-0
  field:  cop1    fmt    ft     fs     cc    0    fc (note)  cond (note)
  value:  010001                             00   11

format:
  c.cond.s fs, ft         mips i
  c.cond.d fs, ft
  c.cond.s cc, fs, ft     mips iv
  c.cond.d cc, fs, ft

purpose:
  compares floating-point values and records the boolean result of the comparison in a condition code.

description:
  this instruction compares the contents of floating-point register fs with the contents of floating-point register ft in accordance with comparison condition cond, and sets the result in the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc. the operands are processed as floating-point format fmt.
  if one of the values is nan and if the most significant bit of comparison condition cond is set, an invalid operation exception occurs. if this exception occurs, the flag bits of fcr31 and fcr26 are set. if the invalid operation exception is enabled (if the enable bits of fcr31 and fcr28 are set), the comparison result is not set, and processing of the exception is started as is. if the enable bits are not set, only the comparison result is set to the cc bit, and the exception is not processed.
  the comparison result is also used to test the fpu branch instruction.
  comparison is executed accurately, and neither overflow nor underflow occurs. one of four mutually exclusive relations, "less than", "equal to", "greater than", and "unordered (comparison impossible)", occurs. if one or both the operands are nan, the result of the comparison is always "unordered". for details of comparison condition cond, refer to table 18-4 logical inversion of term depending on true/false of condition.
  the sign of 0 is ignored during comparison (+0 = -0).
  this instruction is valid only in single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid. the mips i instruction set architecture provides only 1 bit of a floating-point condition code: the c bit in fcr31. therefore, the cc field of the mips i, ii, and iii instruction set architectures must be 0. the mips iv instruction set architecture has seven additional condition code bits. the floating-point comparison instruction and conditional branch instruction specify the condition code bits to be set or tested. both the assembler formats are valid with the mips iv instruction set architecture. note see 18.5 fpu instruction opcode bit encoding .
if a floating-point operation instruction, including a comparison instruction, receives a signaling nan (snan), it is regarded as an invalid operation condition. using a comparison that treats a quiet nan (qnan), as well as an snan, as an invalid operation makes it easy to write a program that reports an error whenever a nan is used. in that case, code that explicitly checks for a qnan that makes the result unordered is unnecessary. instead, an exception occurs if an invalid operation is detected, and errors are processed by the exception processing system. the case of a comparison in which two numeric values are checked for equality, and an error is detected if the result is unordered, is shown below.

  # to test qnan explicitly
          c.eq.d   $f2, $f4     # checks if two values are equal
          nop
          bc1t     l2           # to l2 if equal
          c.un.d   $f2, $f4     # checks if result is unordered if not equal
          bc1t     error        # to error processing if unordered
          # describes processing code if not equal
          # describes processing code if equal
  l2:
          :

  # to use comparison that reports qnan
          c.seq.d  $f2, $f4     # checks if two values are equal
          nop
          bc1t     l2           # to l2 if equal
          nop
          # describes processing code if result is not unordered
          # describes processing code if not equal
          # describes processing code if equal
  l2:
          :
operation:
  32, 64   t: if nan (valuefpr (fs, fmt)) or nan (valuefpr (ft, fmt)) then
                  less <- false
                  equal <- false
                  unordered <- true
                  if cond_3 then
                      signal invalidoperationexception
                  endif
              else
                  less <- valuefpr (fs, fmt) < valuefpr (ft, fmt)
                  equal <- valuefpr (fs, fmt) = valuefpr (ft, fmt)
                  unordered <- false
              endif
              condition <- (cond_2 and less) or (cond_1 and equal) or (cond_0 and unordered)
              setfpconditioncode (cc, condition)

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
  invalid operation exception
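the operation of c.cond.fmt can be traced with ordinary floating-point values. the sketch below follows the pseudocode above; the function name is hypothetical, the invalid operation exception is reported as a returned flag instead of being signalled, and a signaling nan is not distinguished from a quiet nan because python floats do not expose that difference.

```python
import math


def c_cond(cond, fs_value, ft_value):
    """Return (condition, invalid_operation) for a 4-bit cond code."""
    if math.isnan(fs_value) or math.isnan(ft_value):
        less, equal, unordered = False, False, True
        invalid = bool(cond & 8)        # cond_3 set: unordered is an invalid operation
    else:
        less = fs_value < ft_value
        equal = fs_value == ft_value    # +0 and -0 compare equal
        unordered = False
        invalid = False
    condition = bool((cond & 4 and less)
                     or (cond & 2 and equal)
                     or (cond & 1 and unordered))
    return condition, invalid
```

for example, code 2 (eq) on a nan operand gives a false condition without an invalid operation, while code 10 (seq) on the same operands gives a false condition and reports the invalid operation.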
ceil.l.fmt  floating-point ceiling to long fixed-point format

  bits:   31-26   25-21  20-16  15-11  10-6  5-0
  field:  cop1    fmt    0      fs     fd    ceil.l
  value:  010001         00000               001010

format:
  ceil.l.s fd, fs     mips iii
  ceil.l.d fd, fs

purpose:
  rounds up a floating-point value to a 64-bit fixed-point value for conversion.

description:
  this instruction arithmetically converts the contents of floating-point register fs into a 64-bit fixed-point format, and stores the result in floating-point register fd. the source operand is processed as floating-point format fmt.
  the result is rounded toward +infinity regardless of the current rounding mode.
  this instruction is valid only when converting from single-/double-precision floating-point formats.
  if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
  if the source operand is infinity or nan, or if the result of rounding is outside the range of 2^63 - 1 to -2^63, the flag bits of fcr31 and fcr26 are set to indicate an invalid operation. if an invalid operation exception is not enabled, the exception does not occur, and 2^63 - 1 is returned.
  this operation is defined in 64-bit mode or in 32-bit kernel mode. execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

operation:
  64   t: storefpr (fd, l, convertfmt (valuefpr (fs, fmt), fmt, l))

remark  the operation is the same in the 32-bit kernel mode.
exceptions:
  coprocessor unusable exception
  floating-point operation exception
  reserved instruction exception (32-bit user/supervisor mode)

floating-point operation exception:
  unimplemented operation exception
  invalid operation exception
  inexact operation exception
  overflow exception
caution  the unimplemented operation exception occurs in the following cases.
         • if overflow occurs when the format is converted into a fixed-point format
         • if the source operand is infinity
         • if the source operand is nan
         specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^53 - 1 (0x001f ffff ffff ffff) to -2^53 (0xffe0 0000 0000 0000).
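the conditions in the caution above can be checked with a short model of ceil.l. this is a sketch under stated assumptions: the function name and the returned status strings are hypothetical, python floats stand in for the double-precision source operand, and only the unimplemented operation cases listed in the caution are modelled.

```python
import math

LONG_MAX_SUPPORTED = 2**53 - 1      # 0x001f ffff ffff ffff
LONG_MIN_SUPPORTED = -(2**53)       # 0xffe0 0000 0000 0000 as a 64-bit pattern


def ceil_l(value):
    """Return (result, status) for rounding value toward +infinity."""
    if math.isnan(value) or math.isinf(value):
        return None, "unimplemented"
    result = math.ceil(value)
    if not (LONG_MIN_SUPPORTED <= result <= LONG_MAX_SUPPORTED):
        return None, "unimplemented"
    return result, "ok"
```

rounding is always toward +infinity, so 1.2 becomes 2 and -1.2 becomes -1, independent of any rounding mode setting.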
ceil.w.fmt  floating-point ceiling to single fixed-point format

  bits:   31-26   25-21  20-16  15-11  10-6  5-0
  field:  cop1    fmt    0      fs     fd    ceil.w
  value:  010001         00000               001110

format:
  ceil.w.s fd, fs     mips ii
  ceil.w.d fd, fs

purpose:
  rounds up a floating-point value to a 32-bit fixed-point value for conversion.

description:
  this instruction arithmetically converts the contents of floating-point register fs into a 32-bit fixed-point format, and stores the result in floating-point register fd. the source operand is processed as floating-point format fmt.
  the result is rounded toward +infinity regardless of the current rounding mode.
  this instruction is valid only when converting from single-/double-precision floating-point formats.
  if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
  if the source operand is infinity or nan, or if the result of rounding is outside the range of 2^31 - 1 to -2^31, the flag bits of fcr31 and fcr26 are set to indicate an invalid operation. if an invalid operation exception is not enabled, the exception does not occur, and 2^31 - 1 is returned.

operation:
  32, 64   t: storefpr (fd, w, convertfmt (valuefpr (fs, fmt), fmt, w))

exceptions:
  coprocessor unusable exception
  floating-point operation exception

floating-point operation exception:
  unimplemented operation exception
  invalid operation exception
  inexact operation exception
  overflow exception
ceil.w.fmt  Floating-point ceiling to single fixed-point format (2/2)

Caution  The unimplemented operation exception occurs in the following cases.
  - If overflow occurs when the format is converted into a fixed-point format
  - If the source operand is infinity
  - If the source operand is NaN
  Specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^31 − 1 (0x7fff ffff) to −2^31 (0x8000 0000).
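The architectural behavior described for ceil.w.fmt can be sketched in Python. This is a minimal model, not the VR5500 implementation: the function name `ceil_w` and the `(result, invalid)` return shape are illustrative assumptions, and only the case where the invalid operation exception is disabled is modeled. The unimplemented operation exception described in the caution is not modeled.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

def ceil_w(value: float) -> tuple:
    """Hypothetical model of ceil.w.fmt with the invalid operation exception disabled.

    Returns (result, invalid), where invalid mirrors the invalid-operation flag.
    """
    # Infinity and NaN have no 32-bit fixed-point representation.
    if math.isnan(value) or math.isinf(value):
        return INT32_MAX, True
    # Round toward +infinity, regardless of any rounding mode.
    rounded = math.ceil(value)
    # A rounded result outside the 32-bit range is also an invalid operation.
    if rounded > INT32_MAX or rounded < INT32_MIN:
        return INT32_MAX, True
    return rounded, False
```

For example, `ceil_w(1.2)` yields `(2, False)` and `ceil_w(-1.7)` yields `(-1, False)`, while any infinite, NaN, or out-of-range source yields `2**31 - 1` with the flag set.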
cfc1  Move control word from FPU (coprocessor 1)

  31..26   25..21   20..16   15..11   10..0
  cop1     cf       rt       fs       0
  010001   00010                      00000000000

Format:
  cfc1 rt, fs        MIPS I

Purpose:
  Copies a word from an FPU control register to a general-purpose register.

Description:
  This instruction loads the contents of floating-point control register fs to general-purpose register rt of the CPU.
  This instruction is defined only if fs is 0, 25, 26, 28, or 31. Otherwise, the result is undefined.

Remark  Of the floating-point control registers, FCR25, FCR26, and FCR28 are provided in the VR5500. Therefore, these registers cannot be specified as fs with the MIPS I, II, III, and IV instruction set architectures.

Operation:
  32  t:      temp ← fcr[fs]
      t + 1:  gpr[rt] ← temp
  64  t:      temp ← fcr[fs]
      t + 1:  gpr[rt] ← (temp_31)^32 || temp

Exceptions:
  Coprocessor unusable exception
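The 64-bit operation of cfc1, which copies bit 31 of the control word into the upper 32 bits of the destination, can be illustrated as follows. This is a sketch under stated assumptions: the `fcr` dictionary, the function name `cfc1`, and the use of `ValueError` to stand in for "undefined result" are all hypothetical modeling choices, not part of the manual.

```python
DEFINED_FCRS = {0, 25, 26, 28, 31}  # register numbers listed in the description

def cfc1(fcr: dict, fs: int) -> int:
    """Hypothetical model of cfc1 in 64-bit mode: returns the new value of gpr[rt]."""
    if fs not in DEFINED_FCRS:
        # The manual says the result is undefined; the model simply refuses.
        raise ValueError("cfc1 is undefined for fcr%d" % fs)
    temp = fcr[fs] & 0xFFFF_FFFF
    # (temp_31)^32 || temp : replicate bit 31 into bits 63..32.
    if temp & 0x8000_0000:
        return temp | 0xFFFF_FFFF_0000_0000
    return temp
```

A control word with bit 31 clear is returned unchanged; one with bit 31 set gains 32 leading one bits.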
ctc1  Move control word to FPU (coprocessor 1)

  31..26   25..21   20..16   15..11   10..0
  cop1     ct       rt       fs       0
  010001   00110                      00000000000

Format:
  ctc1 rt, fs        MIPS I

Purpose:
  Copies a word from a general-purpose register to an FPU control register.

Description:
  This instruction loads the contents of general-purpose register rt of the CPU to floating-point control register fs.
  This instruction is defined only if fs is 0, 25, 26, 28, or 31. Otherwise, the result is undefined.
  If writing data to the control/status register (FCR31) sets a cause bit of this register together with its corresponding enable bit, a floating-point operation exception occurs. The data is written to the register before the exception occurs.

Remark  Of the floating-point control registers, FCR25, FCR26, and FCR28 are provided in the VR5500. Therefore, these registers cannot be specified as fs with the MIPS I, II, III, and IV instruction set architectures.

Operation:
  32  t:      temp ← gpr[rt]
      t + 1:  fcr[fs] ← temp
  64  t:      temp ← gpr[rt]_31..0
      t + 1:  fcr[fs] ← temp

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception
  Floating-point operation exception

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Division-by-zero exception
  Overflow exception
  Underflow exception
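The condition under which a write to FCR31 raises a floating-point operation exception can be sketched as a bit test. The bit positions used here are an assumption taken from the conventional MIPS control/status register layout (cause bits 17..12 in the order E, V, Z, O, U, I; enable bits 11..7 in the order V, Z, O, U, I); this section of the manual does not restate them, so check them against the FCR31 register description before relying on this. The function name is hypothetical.

```python
CAUSE_SHIFT = 12      # assumed: cause bits 17..12 (E, V, Z, O, U, I)
ENABLE_SHIFT = 7      # assumed: enable bits 11..7 (V, Z, O, U, I)
CAUSE_E = 0x20        # unimplemented operation has no enable bit

def ctc1_write_traps(value: int) -> bool:
    """Return True if writing `value` to FCR31 would raise an FP exception."""
    cause = (value >> CAUSE_SHIFT) & 0x3F
    enable = (value >> ENABLE_SHIFT) & 0x1F
    # A cause bit traps only when its matching enable bit is also set;
    # the E cause bit is assumed to trap unconditionally.
    return bool(cause & CAUSE_E) or bool(cause & enable)
```

Under these assumptions, restoring a saved FCR31 image that still has a cause bit and its enable bit set would trap immediately, which is why saved images are usually written back with the cause field cleared.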
cvt.d.fmt  Floating-point convert to double floating-point format (1/2)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1     fmt      0        fs       fd      cvt.d
  010001            00000                     100001

Format:
  cvt.d.s fd, fs        MIPS I
  cvt.d.w fd, fs
  cvt.d.l fd, fs        MIPS III

Purpose:
  Converts a floating-point value or fixed-point value into a double-precision floating-point value.

Description:
  This instruction arithmetically converts the contents of floating-point register fs into a double-precision floating-point format in accordance with the current rounding mode, and stores the result in floating-point register fd. The source operand is processed as floating-point format fmt.
  This instruction is valid only when converting from a single-precision floating-point format or from a 32-bit or 64-bit fixed-point format.
  Conversion from the single-precision floating-point format and from the 32-bit fixed-point format is exact; no accuracy is lost.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.

Operation:
  32, 64  t:  storefpr (fd, d, convertfmt (valuefpr (fs, fmt), fmt, d))

Exceptions:
  Coprocessor unusable exception
  Floating-point operation exception

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
cvt.d.fmt  Floating-point convert to double floating-point format (2/2)

Caution  The unimplemented operation exception occurs in the following cases.
  - If overflow occurs when the format is converted into a fixed-point format
  - If the source operand is infinity
  - If the source operand is NaN
  Specifically, the unimplemented operation exception occurs if conversion is executed when the value of the source operand is outside the range of 2^55 − 1 (0x007f ffff ffff ffff) to −2^55 (0xff80 0000 0000 0000).
cvt.l.fmt  Floating-point convert to long fixed-point format (1/2)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1     fmt      0        fs       fd      cvt.l
  010001            00000                     100101

Format:
  cvt.l.s fd, fs        MIPS III
  cvt.l.d fd, fs

Purpose:
  Converts a floating-point value into a 64-bit fixed-point value.

Description:
  This instruction arithmetically converts the contents of floating-point register fs into a 64-bit fixed-point format in accordance with the current rounding mode, and stores the result in floating-point register fd. The source operand is processed as floating-point format fmt.
  This instruction is valid only when converting from single-/double-precision floating-point formats.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.
  If the source operand is infinity or NaN, or if the result of rounding is outside the range of 2^63 − 1 to −2^63, the flag bits of FCR31 and FCR26 are set to indicate an invalid operation. If the invalid operation exception is not enabled, the exception does not occur, and 2^63 − 1 is returned.
  This operation is defined in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
  64  t:  storefpr (fd, l, convertfmt (valuefpr (fs, fmt), fmt, l))

Remark  The operation is the same in the 32-bit kernel mode.

Exceptions:
  Coprocessor unusable exception
  Floating-point operation exception
  Reserved instruction exception (32-bit user/supervisor mode)

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Overflow exception
cvt.l.fmt  Floating-point convert to long fixed-point format (2/2)

Caution  The unimplemented operation exception occurs in the following cases.
  - If overflow occurs when the format is converted into a fixed-point format
  - If the source operand is infinity
  - If the source operand is NaN
  Specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^53 − 1 (0x001f ffff ffff ffff) to −2^53 (0xffe0 0000 0000 0000).
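The hexadecimal limits quoted in the caution are 64-bit two's-complement encodings of 2^53 − 1 and −2^53. The following sketch decodes them and applies the range test to an already-converted integer result. The helper names are illustrative assumptions, and the rounding step itself is deliberately left out.

```python
LIMIT_HIGH_BITS = 0x001F_FFFF_FFFF_FFFF   # quoted encoding of 2^53 - 1
LIMIT_LOW_BITS = 0xFFE0_0000_0000_0000    # quoted encoding of -2^53

def to_signed64(bits: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement integer."""
    bits &= (1 << 64) - 1
    return bits - (1 << 64) if bits & (1 << 63) else bits

def result_is_unimplemented(result: int) -> bool:
    """True if a converted 64-bit result lies outside the range in the caution."""
    low = to_signed64(LIMIT_LOW_BITS)
    high = to_signed64(LIMIT_HIGH_BITS)
    return not (low <= result <= high)
```

The limit of 53 bits matches the significand width of the double-precision format, so results inside this range are exactly representable as doubles.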
cvt.s.fmt  Floating-point convert to single floating-point format (1/2)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1     fmt      0        fs       fd      cvt.s
  010001            00000                     100000

Format:
  cvt.s.d fd, fs        MIPS I
  cvt.s.w fd, fs
  cvt.s.l fd, fs        MIPS III

Purpose:
  Converts a floating-point value or fixed-point value into a single-precision floating-point value.

Description:
  This instruction arithmetically converts the contents of floating-point register fs into a single-precision floating-point format, and stores the result in floating-point register fd. The source operand is processed as floating-point format fmt. The result is rounded in accordance with the current rounding mode.
  This instruction is valid only when converting from a double-precision floating-point format or from a 32-bit or 64-bit fixed-point format.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.

Operation:
  32, 64  t:  storefpr (fd, s, convertfmt (valuefpr (fs, fmt), fmt, s))

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception
  Floating-point operation exception

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Overflow exception
  Underflow exception
cvt.s.fmt  Floating-point convert to single floating-point format (2/2)

Caution  The unimplemented operation exception occurs in the following cases.
  - If overflow occurs when the format is converted into a fixed-point format
  - If the source operand is infinity
  - If the source operand is NaN
  Specifically, the unimplemented operation exception occurs if conversion is executed when the value of the source operand is outside the range of 2^55 − 1 (0x007f ffff ffff ffff) to −2^55 (0xff80 0000 0000 0000).
cvt.w.fmt  Floating-point convert to single fixed-point format (1/2)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1     fmt      0        fs       fd      cvt.w
  010001            00000                     100100

Format:
  cvt.w.s fd, fs        MIPS I
  cvt.w.d fd, fs

Purpose:
  Converts a floating-point value into a 32-bit fixed-point value.

Description:
  This instruction arithmetically converts the contents of floating-point register fs into a 32-bit fixed-point format, and stores the result in floating-point register fd. The source operand is processed as floating-point format fmt.
  This instruction is valid only when converting from single-/double-precision floating-point formats.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.
  If the source operand is infinity or NaN, or if the result of rounding is outside the range of 2^31 − 1 to −2^31, the flag bits of FCR31 and FCR26 are set to indicate an invalid operation. If the invalid operation exception is not enabled, the exception does not occur, and 2^31 − 1 is returned.

Operation:
  32, 64  t:  storefpr (fd, w, convertfmt (valuefpr (fs, fmt), fmt, w))

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception
  Floating-point operation exception

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Overflow exception
cvt.w.fmt  Floating-point convert to single fixed-point format (2/2)

Caution  The unimplemented operation exception occurs in the following cases.
  - If overflow occurs when the format is converted into a fixed-point format
  - If the source operand is infinity
  - If the source operand is NaN
  Specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^31 − 1 (0x7fff ffff) to −2^31 (0x8000 0000).
div.fmt  Floating-point divide

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1     fmt      ft       fs       fd      div
  010001                                      000011

Format:
  div.s fd, fs, ft        MIPS I
  div.d fd, fs, ft

Purpose:
  Divides a floating-point value.

Description:
  This instruction divides the contents of floating-point register fs by the contents of floating-point register ft, and stores the result in floating-point register fd. The operand is processed as floating-point format fmt. The operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode.
  This instruction is valid only in single-/double-precision floating-point formats.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.

Operation:
  32, 64  t:  storefpr (fd, fmt, valuefpr (fs, fmt) / valuefpr (ft, fmt))

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception
  Floating-point operation exception

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Division-by-zero exception
  Overflow exception
  Underflow exception
dmfc1  Doubleword move from FPU (coprocessor 1)

  31..26   25..21   20..16   15..11   10..0
  cop1     dmf      rt       fs       0
  010001   00001                      00000000000

Format:
  dmfc1 rt, fs        MIPS III

Purpose:
  Copies a doubleword from a floating-point register to a general-purpose register.

Description:
  This instruction loads the contents of floating-point general-purpose register fs to general-purpose register rt of the CPU.
  The FR bit of the status register indicates whether all 32 registers of the processor can be specified. If the FR bit is 0 and the least significant bit of fs is 1, this instruction is undefined.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.
  This operation is defined in 64-bit mode or in 32-bit kernel mode.

Operation:
  64  t:      if sr_26 = 1 then
                  data ← fgr[fs]
              elseif fs_0 = 0 then
                  data ← fgr[fs+1] || fgr[fs]
              else
                  data ← undefined^64
              endif
      t + 1:  gpr[rt] ← data

Remark  The operation is the same in the 32-bit kernel mode.

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception (32-bit user/supervisor mode)
dmtc1  Doubleword move to FPU (coprocessor 1)

  31..26   25..21   20..16   15..11   10..0
  cop1     dmt      rt       fs       0
  010001   00101                      00000000000

Format:
  dmtc1 rt, fs        MIPS III

Purpose:
  Copies a doubleword from a general-purpose register to a floating-point register.

Description:
  This instruction loads the contents of general-purpose register rt of the CPU to floating-point general-purpose register fs.
  The FR bit of the status register indicates whether all 32 registers of the processor can be specified. If the FR bit is 0 and the least significant bit of fs is 1, this instruction is undefined.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.
  This operation is defined in 64-bit mode or in 32-bit kernel mode.

Operation:
  64  t:      data ← gpr[rt]
      t + 1:  if sr_26 = 1 then
                  fgr[fs] ← data
              elseif fs_0 = 0 then
                  fgr[fs+1] ← data_63..32
                  fgr[fs] ← data_31..0
              else
                  undefined_result
              endif

Remark  The operation is the same in the 32-bit kernel mode.

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception (32-bit user/supervisor mode)
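The register pairing in the dmtc1 operation can be sketched in Python. This is an illustrative model only: the register file is a plain list of 32 integers, the function name and argument order are assumptions, and the undefined case is represented by raising `ValueError`.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def dmtc1(fgr: list, fs: int, data: int, fr_bit: int) -> None:
    """Hypothetical model of the dmtc1 operation (fr_bit stands for sr_26)."""
    data &= MASK64
    if fr_bit == 1:
        # All 32 registers are 64 bits wide: store the whole doubleword.
        fgr[fs] = data
    elif fs % 2 == 0:
        # FR = 0: the even/odd pair holds the lower/higher 32 bits.
        fgr[fs + 1] = (data >> 32) & MASK32
        fgr[fs] = data & MASK32
    else:
        # FR = 0 with an odd register number: undefined in the manual.
        raise ValueError("undefined_result: odd fs with FR = 0")
```

With FR = 0, writing `0x0123456789abcdef` to register 4 leaves `0x89abcdef` in register 4 and `0x01234567` in register 5.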
floor.l.fmt  Floating-point floor to long fixed-point format (1/2)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1     fmt      0        fs       fd      floor.l
  010001            00000                     001011

Format:
  floor.l.s fd, fs        MIPS III
  floor.l.d fd, fs

Purpose:
  Rounds down a floating-point value to a 64-bit fixed-point value for conversion.

Description:
  This instruction arithmetically converts the contents of floating-point register fs into a 64-bit fixed-point format, and stores the result in floating-point register fd. The source operand is processed as floating-point format fmt. The result is rounded toward −∞ regardless of the current rounding mode.
  This instruction is valid only when converting from single-/double-precision floating-point formats.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.
  If the source operand is infinity or NaN, or if the result of rounding is outside the range of 2^63 − 1 to −2^63, the flag bits of FCR31 and FCR26 are set to indicate an invalid operation. If the invalid operation exception is not enabled, the exception does not occur, and 2^63 − 1 is returned.
  This operation is defined in 64-bit mode or in 32-bit kernel mode. Execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

Operation:
  64  t:  storefpr (fd, l, convertfmt (valuefpr (fs, fmt), fmt, l))

Remark  The operation is the same in the 32-bit kernel mode.

Exceptions:
  Coprocessor unusable exception
  Floating-point operation exception
  Reserved instruction exception (32-bit user/supervisor mode)

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Overflow exception
floor.l.fmt  Floating-point floor to long fixed-point format (2/2)

Caution  The unimplemented operation exception occurs in the following cases.
  - If overflow occurs when the format is converted into a fixed-point format
  - If the source operand is infinity
  - If the source operand is NaN
  Specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^53 − 1 (0x001f ffff ffff ffff) to −2^53 (0xffe0 0000 0000 0000).
floor.w.fmt  Floating-point floor to single fixed-point format (1/2)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1     fmt      0        fs       fd      floor.w
  010001            00000                     001111

Format:
  floor.w.s fd, fs        MIPS II
  floor.w.d fd, fs

Purpose:
  Rounds down a floating-point value to a 32-bit fixed-point value for conversion.

Description:
  This instruction arithmetically converts the contents of floating-point register fs into a 32-bit fixed-point format, and stores the result in floating-point register fd. The source operand is processed as floating-point format fmt. The result is rounded toward −∞ regardless of the current rounding mode.
  This instruction is valid only when converting from single-/double-precision floating-point formats.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.
  If the source operand is infinity or NaN, or if the result of rounding is outside the range of 2^31 − 1 to −2^31, the flag bits of FCR31 and FCR26 are set to indicate an invalid operation. If the invalid operation exception is not enabled, the exception does not occur, and 2^31 − 1 is returned.

Operation:
  32, 64  t:  storefpr (fd, w, convertfmt (valuefpr (fs, fmt), fmt, w))

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception
  Floating-point operation exception

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Overflow exception
floor.w.fmt  Floating-point floor to single fixed-point format (2/2)

Caution  The unimplemented operation exception occurs in the following cases.
  - If overflow occurs when the format is converted into a fixed-point format
  - If the source operand is infinity
  - If the source operand is NaN
  Specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^31 − 1 (0x7fff ffff) to −2^31 (0x8000 0000).
ldc1  Load doubleword to FPU (coprocessor 1) (1/2)

  31..26   25..21   20..16   15..0
  ldc1     base     ft       offset
  110101

Format:
  ldc1 ft, offset (base)        MIPS II

Purpose:
  Loads a doubleword from memory to a floating-point register.

Description:
  This instruction sign-extends a 16-bit offset and adds the result to the contents of general-purpose register base to generate a virtual address.
  If the FR bit of the status register is 0, the contents of the doubleword at the memory position specified by the virtual address are loaded to floating-point registers ft and ft + 1. At this time, the higher 32 bits of the doubleword are stored in the odd-numbered register specified by ft + 1, and the lower 32 bits are stored in the even-numbered register specified by ft. If the least significant bit of the ft field is not 0, the operation is undefined.
  If the FR bit is 1, the contents of the doubleword at the memory position specified by the virtual address are loaded to floating-point register ft.
  An address error exception occurs if the lower 3 bits of the address are not 0.
ldc1  Load doubleword to FPU (coprocessor 1) (2/2)

Operation:
  32  t:  vaddr ← ((offset_15)^16 || offset_15..0) + gpr[base]
          (paddr, uncached) ← address translation (vaddr, data)
          data ← loadmemory (uncached, doubleword, paddr, vaddr, data)
          if sr_26 = 1 then
              fgr[ft] ← data
          elseif ft_0 = 0 then
              fgr[ft+1] ← data_63..32
              fgr[ft] ← data_31..0
          else
              undefined_result
          endif

  64  t:  vaddr ← ((offset_15)^48 || offset_15..0) + gpr[base]
          (paddr, uncached) ← address translation (vaddr, data)
          data ← loadmemory (uncached, doubleword, paddr, vaddr, data)
          if sr_26 = 1 then
              fgr[ft] ← data
          elseif ft_0 = 0 then
              fgr[ft+1] ← data_63..32
              fgr[ft] ← data_31..0
          else
              undefined_result
          endif

Exceptions:
  Coprocessor unusable exception
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Reserved instruction exception
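The address generation step of ldc1 (sign-extend the 16-bit offset, add the base register, then check doubleword alignment) can be sketched as follows. The function name, the `mode64` flag, and the use of `ValueError` to represent the address error exception are assumptions made for illustration; address translation and the memory access are not modeled.

```python
def ldc1_vaddr(base: int, offset16: int, mode64: bool = True) -> int:
    """Hypothetical model of ldc1 virtual address generation."""
    offset = offset16 & 0xFFFF
    # Sign-extend: replicate bit 15 of the offset into the upper bits.
    if offset & 0x8000:
        offset -= 0x1_0000
    width = 64 if mode64 else 32
    vaddr = (base + offset) & ((1 << width) - 1)
    # A doubleword access requires the lower 3 address bits to be 0.
    if vaddr & 0x7:
        raise ValueError("address error: vaddr is not doubleword aligned")
    return vaddr
```

An offset field of `0xfff8` is therefore interpreted as −8, so `ldc1_vaddr(0x1000, 0xfff8)` gives `0xff8`.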
ldxc1  Load doubleword indexed to FPU (coprocessor 1)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1x    base     index    0        fd      ldxc1
  010011                     00000            000001

Format:
  ldxc1 fd, index (base)        MIPS IV

Purpose:
  Loads a doubleword from memory to a floating-point register (general-purpose register + general-purpose register addressing).

Description:
  This instruction adds the contents of CPU general-purpose register index and the contents of CPU general-purpose register base to generate a virtual address.
  If the FR bit of the status register is 0, the contents of the doubleword at the memory position specified by the virtual address are loaded to floating-point registers fd and fd + 1. At this time, the higher 32 bits of the doubleword are stored in the odd-numbered register specified by fd + 1, and the lower 32 bits are stored in the even-numbered register specified by fd. If the least significant bit of the fd field is not 0, the operation is undefined.
  If the FR bit is 1, the contents of the doubleword at the memory position specified by the virtual address are loaded to floating-point register fd.
  The operation is undefined if bits 63 and 62 of the virtual address are not the same as bits 63 and 62 of general-purpose register base.
  An address error exception occurs if the lower 3 bits of the virtual address are not 0.

Operation:
  32, 64  t:  vaddr ← gpr[base] + gpr[index]
              (paddr, cca) ← address translation (vaddr, data)
              data ← loadmemory (cca, doubleword, paddr, vaddr, data)
              if sr_26 = 1 then
                  fgr[fd] ← data
              elseif fd_0 = 0 then
                  fgr[fd+1] ← data_63..32
                  fgr[fd] ← data_31..0
              else
                  undefined_result
              endif

Exceptions:
  Coprocessor unusable exception
  TLB refill exception
  TLB invalid exception
  Address error exception
  Reserved instruction exception
luxc1  Load doubleword indexed unaligned to FPU (coprocessor 1) (1/2)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1x    base     index    0        fd      luxc1
  010011                     00000            000101

Format:
  luxc1 fd, index (base)        MIPS V

Purpose:
  Loads a doubleword from memory to a floating-point register (general-purpose register + general-purpose register addressing).

Description:
  This instruction adds the contents of CPU general-purpose register index and the contents of CPU general-purpose register base to generate a virtual address. The lower 3 bits of the virtual address are masked to 0. Therefore, an address error exception does not occur even if the lower 3 bits of the virtual address are not 0.
  If the FR bit of the status register is 0, the contents of the doubleword at the memory position specified by the virtual address are loaded to floating-point registers fd and fd + 1. At this time, the higher 32 bits of the doubleword are stored in the odd-numbered register specified by fd + 1, and the lower 32 bits are stored in the even-numbered register specified by fd. If the least significant bit of the fd field is not 0, the operation is undefined.
  If the FR bit is 1, the contents of the doubleword at the memory position specified by the virtual address are loaded to floating-point register fd.
  The operation is undefined if bits 63 and 62 of the virtual address are not the same as bits 63 and 62 of general-purpose register base.

Operation:
  32, 64  t:  vaddr ← (gpr[base] + gpr[index])_63..3 || 0^3
              (paddr, cca) ← address translation (vaddr, data)
              data ← loadmemory (cca, doubleword, paddr, vaddr, data)
              if sr_26 = 1 then
                  fgr[fd] ← data
              elseif fd_0 = 0 then
                  fgr[fd+1] ← data_63..32
                  fgr[fd] ← data_31..0
              else
                  undefined_result
              endif
luxc1  Load doubleword indexed unaligned to FPU (coprocessor 1) (2/2)

Exceptions:
  Coprocessor unusable exception
  TLB refill exception
  TLB invalid exception
  TLB modified exception
  Bus error exception
  Address error exception
  Reserved instruction exception
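The address calculation of luxc1, which forces the lower 3 bits of the sum to 0 instead of raising an address error, can be sketched as below. Both function names are illustrative assumptions; the second helper expresses the bits 63 and 62 condition from the description as a simple comparison.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF

def luxc1_vaddr(base: int, index: int) -> int:
    """(gpr[base] + gpr[index])_63..3 || 0^3 : clear the lower 3 bits of the sum."""
    return (base + index) & MASK64 & ~0x7

def same_top_bits(vaddr: int, base: int) -> bool:
    """True if bits 63 and 62 of vaddr match those of base (otherwise undefined)."""
    return (vaddr >> 62) & 0x3 == (base >> 62) & 0x3
```

For example, a base of `0x1000` with an index of `0x5` addresses the doubleword at `0x1000`, and an index of `0xf` addresses the one at `0x1008`.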
lwc1  Load word to FPU (coprocessor 1) (1/2)

  31..26   25..21   20..16   15..0
  lwc1     base     ft       offset
  110001

Format:
  lwc1 ft, offset (base)        MIPS I

Purpose:
  Loads a word from memory to a floating-point register.

Description:
  This instruction sign-extends a 16-bit offset and adds the result to the contents of general-purpose register base to generate a virtual address. The contents of the word at the memory position specified by the virtual address are loaded to floating-point register ft.
  If the FR bit of the status register is 0 and the least significant bit of the ft field is 0, the contents of the word are stored in the lower 32 bits of floating-point register ft. If the least significant bit of the ft field is 1, the contents of the word are stored in the higher 32 bits of floating-point register ft − 1.
  If the FR bit is 1, all the 64-bit floating-point registers can be accessed. Therefore, the contents of the word are stored in floating-point register ft. The values of the higher 32 bits are undefined.
  An address error exception occurs if the lower 2 bits of the address are not 0.
lwc1  Load word to FPU (coprocessor 1) (2/2)

Operation:
  32  t:  vaddr ← ((offset_15)^16 || offset_15..0) + gpr[base]
          (paddr, uncached) ← address translation (vaddr, data)
          data ← loadmemory (uncached, word, paddr, vaddr, data)
          if sr_26 = 1 then
              fgr[ft] ← undefined^32 || data
          else
              fgr[ft] ← data
          endif

  64  t:  vaddr ← ((offset_15)^48 || offset_15..0) + gpr[base]
          (paddr, uncached) ← address translation (vaddr, data)
          data ← loadmemory (uncached, word, paddr, vaddr, data)
          if sr_26 = 1 then
              fgr[ft] ← undefined^32 || data
          else
              fgr[ft] ← data
          endif

Exceptions:
  Coprocessor unusable exception
  TLB refill exception
  TLB invalid exception
  Bus error exception
  Address error exception
  Reserved instruction exception
lwxc1  Load word indexed to FPU (coprocessor 1)

  31..26   25..21   20..16   15..11   10..6   5..0
  cop1x    base     index    0        fd      lwxc1
  010011                     00000            000000

Format:
  lwxc1 fd, index (base)        MIPS IV

Purpose:
  Loads a word from memory to a floating-point register (general-purpose register + general-purpose register addressing).

Description:
  This instruction adds the contents of CPU general-purpose register index and the contents of CPU general-purpose register base to generate a virtual address. The contents of the word at the memory position specified by the virtual address are loaded to floating-point register fd.
  If the FR bit of the status register is 0 and the least significant bit of the fd field is 0, the contents of the word are stored in the lower 32 bits of floating-point register fd. If the least significant bit of the fd field is 1, the contents of the word are stored in the higher 32 bits of floating-point register fd − 1.
  If the FR bit is 1, the contents of the word at the memory position specified by the virtual address are stored in floating-point register fd. The values of the higher 32 bits are undefined.
  The operation is undefined if bits 63 and 62 of the virtual address are not the same as bits 63 and 62 of general-purpose register base.
  An address error exception occurs if the lower 2 bits of the virtual address are not 0.

Operation:
  32, 64  t:  vaddr ← gpr[base] + gpr[index]
              (paddr, cca) ← address translation (vaddr, data)
              data ← loadmemory (cca, word, paddr, vaddr, data)
              if sr_26 = 1 then
                  fgr[fd] ← undefined^32 || data
              elseif fd_0 = 0 then
                  fgr[fd] ← fgr[fd]_63..32 || data
              else
                  fgr[fd − 1] ← data || fgr[fd − 1]_31..0
              endif

Exceptions:
  Coprocessor unusable exception
  TLB refill exception
  TLB invalid exception
  Address error exception
  Reserved instruction exception
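The three register-update cases in the lwxc1 operation can be modeled on a list of 64-bit integers. This is a sketch under assumptions: the function name and arguments are hypothetical, and the "undefined" upper 32 bits in the FR = 1 case are modeled as 0 purely so that the example is deterministic.

```python
MASK32 = 0xFFFF_FFFF
HIGH32 = 0xFFFF_FFFF_0000_0000

def lwxc1_store(fgr: list, fd: int, word: int, fr_bit: int) -> None:
    """Hypothetical model of how lwxc1 places a loaded word into fgr."""
    word &= MASK32
    if fr_bit == 1:
        # undefined^32 || data : upper half is undefined; modeled as 0 here.
        fgr[fd] = word
    elif fd % 2 == 0:
        # fgr[fd]_63..32 || data : keep the upper half, replace the lower half.
        fgr[fd] = (fgr[fd] & HIGH32) | word
    else:
        # data || fgr[fd - 1]_31..0 : replace the upper half of register fd - 1.
        fgr[fd - 1] = (word << 32) | (fgr[fd - 1] & MASK32)
```

With FR = 0, loading to an even register number and then to the following odd number fills the lower and higher halves of the same 64-bit register.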
madd.fmt  Floating-point multiply-add

  31..26   25..21   20..16   15..11   10..6   5..3   2..0
  cop1x    fr       ft       fs       fd      madd   fmt
  010011                                      100

Format:
  madd.s fd, fr, fs, ft        MIPS IV
  madd.d fd, fr, fs, ft

Purpose:
  Combines multiplication and addition of floating-point values for execution.

Description:
  This instruction multiplies the contents of floating-point register fs by the contents of floating-point register ft, adds the contents of floating-point register fr to the result, and stores the result of the addition in floating-point register fd. The operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode. The operand is processed as floating-point format fmt.
  If the FR bit of the status register is 0, only an even register number can be specified, because an adjoining pair of even- and odd-numbered registers is used as one floating-point register. If an odd number is specified, the operation is undefined. If the FR bit of the status register is 1, both odd and even register numbers are valid.
  If the condition of an exception is detected but the exception does not occur, the cause bit and flag bit of a floating-point control register are ORed and the result is written to the flag bit.

Operation:
  32, 64  t:  storefpr (fd, fmt, valuefpr (fr, fmt) + valuefpr (fs, fmt) * valuefpr (ft, fmt))

Exceptions:
  Coprocessor unusable exception
  Reserved instruction exception
  Floating-point operation exception

Floating-point operation exceptions:
  Unimplemented operation exception
  Invalid operation exception
  Inexact operation exception
  Overflow exception
  Underflow exception

Caution  If the result of multiplication is a denormalized number, or an underflow or overflow occurs, an unimplemented operation exception actually occurs.
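The phrase "executed as if it were of infinite accuracy" can be illustrated by comparing a single final rounding with a product that is rounded before the addition. This sketch takes the description literally and makes no claim about how the hardware computes the result: it uses Python's exact `Fraction` arithmetic, double precision only, and round-to-nearest only, and both function names are hypothetical.

```python
from fractions import Fraction

def madd_single_rounding(fr: float, fs: float, ft: float) -> float:
    """Compute fs * ft + fr exactly, then round once to double precision."""
    exact = Fraction(fs) * Fraction(ft) + Fraction(fr)
    return float(exact)  # one rounding, to nearest

def madd_two_roundings(fr: float, fs: float, ft: float) -> float:
    """Round the product first, then round the sum (ordinary float arithmetic)."""
    return fs * ft + fr
```

With fs = 1 + 2^-30, ft = 1 − 2^-30, and fr = −1, the exact product is 1 − 2^-60. Rounding the product first gives 1.0 and a final result of 0.0, while a single rounding preserves the result −2^-60.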
mfc1 move word from fpu (coprocessor 1)

encoding: cop1 010001 [31:26] | mf 00000 [25:21] | rt [20:16] | fs [15:11] | 0 00000000000 [10:0]

format: mfc1 rt, fs (mips i)

purpose: copies a word from an fpu (cp1) general-purpose register to a general-purpose register.

description: this instruction loads the contents of floating-point general-purpose register fs to general-purpose register rt of the cpu.
if the fr bit of the status register is 0 and the least significant bit of fs is 0, the lower 32 bits of floating-point register fs are stored in general-purpose register rt. if the least significant bit of fs is 1, the higher 32 bits of floating-point register fs − 1 are stored in general-purpose register rt.
if the fr bit is 1, all the 64-bit floating-point registers can be accessed. therefore, the lower 32 bits of floating-point register fs are stored in general-purpose register rt.

operation:
32  t:     data ← fgr[fs]_31..0
    t + 1: gpr[rt] ← data
64  t:     data ← fgr[fs]_31..0
    t + 1: gpr[rt] ← (data_31)^32 || data

exceptions:
    coprocessor unusable exception
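the register selection and the 64-bit sign extension described for mfc1 can be modeled as follows. this is a sketch under assumptions: mfc1 here is a hypothetical python function, the floating-point registers are a list of 64-bit integers, and the mode64 parameter stands in for the 32-bit versus 64-bit operation rows.

```python
MASK32 = 0xFFFFFFFF


def mfc1(fgr: list, fs: int, fr_bit: int, mode64: bool = True) -> int:
    """Return the value written to gpr[rt] by the modeled mfc1."""
    if fr_bit == 0 and (fs & 1):
        # Odd fs with fr = 0: take the higher 32 bits of register fs - 1.
        data = (fgr[fs - 1] >> 32) & MASK32
    else:
        # Even fs with fr = 0, or any fs with fr = 1: take the lower 32 bits.
        data = fgr[fs] & MASK32
    if mode64:
        # (data_31)^32 || data: replicate bit 31 into the higher 32 bits.
        sign = (data >> 31) & 1
        return (MASK32 << 32 if sign else 0) | data
    return data
```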
mov.fmt floating-point move

encoding: cop1 010001 [31:26] | fmt [25:21] | 0 00000 [20:16] | fs [15:11] | fd [10:6] | mov 000110 [5:0]

format: mov.s fd, fs (mips i)
        mov.d fd, fs

purpose: transfers a floating-point value between floating-point registers.

description: this instruction stores the contents of floating-point register fs in floating-point register fd. the operand is processed as floating-point format fmt.
this instruction is executed non-arithmetically, and no ieee 754 exception occurs.
this instruction is valid only in single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
32, 64  t: storefpr (fd, fmt, valuefpr (fs, fmt))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
movf move conditional on fpu false

encoding: special 000000 [31:26] | rs [25:21] | cc [20:18] | 0 [17] | tf 0 [16] | rd [15:11] | 0 00000 [10:6] | movci 000001 [5:0]

format: movf rd, rs, cc (mips iv)

purpose: tests a floating-point condition code and conditionally moves the contents of a general-purpose register.

description: if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is false (0), the contents of cpu general-purpose register rs are stored in cpu general-purpose register rd.
the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
tf specifies which is used as the branch condition, true or false. the value of tf is fixed for each instruction.

operation:
32, 64  t: if fpconditioncode(cc) = 0 then
               gpr[rd] ← gpr[rs]
           endif

exceptions:
    coprocessor unusable exception
    reserved instruction exception
movf.fmt floating-point move conditional on fpu false

encoding: cop1 010001 [31:26] | fmt [25:21] | cc [20:18] | 0 [17] | tf 0 [16] | fs [15:11] | fd [10:6] | movcf 010001 [5:0]

format: movf.s fd, fs, cc (mips iv)
        movf.d fd, fs, cc

purpose: tests a floating-point condition code and conditionally moves a floating-point value.

description: if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is false (0), the contents of floating-point register fs are stored in floating-point register fd. the source and destination operands are processed as floating-point format fmt.
the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
tf specifies which is used as the branch condition, true or false. the value of tf is fixed for each instruction.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
this instruction is executed non-arithmetically, and no ieee 754 exception occurs.

operation:
32, 64  t: if fpconditioncode(cc) = 0 then
               storefpr (fd, fmt, valuefpr (fs, fmt))
           else
               storefpr (fd, fmt, valuefpr (fd, fmt))
           endif

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
movn.fmt floating-point move conditional on not zero

encoding: cop1 010001 [31:26] | fmt [25:21] | rt [20:16] | fs [15:11] | fd [10:6] | movn 010011 [5:0]

format: movn.s fd, fs, rt (mips iv)
        movn.d fd, fs, rt

purpose: tests the value of a general-purpose register and conditionally moves a floating-point value.

description: if the contents of cpu general-purpose register rt are not 0, this instruction stores the contents of floating-point register fs in floating-point register fd. the source and destination operands are processed as floating-point format fmt.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
this instruction is executed non-arithmetically, and no ieee 754 exception occurs.

operation:
32, 64  t: if gpr[rt] ≠ 0 then
               storefpr (fd, fmt, valuefpr (fs, fmt))
           else
               storefpr (fd, fmt, valuefpr (fd, fmt))
           endif

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
movt move conditional on fpu true

encoding: special 000000 [31:26] | rs [25:21] | cc [20:18] | 0 [17] | tf 1 [16] | rd [15:11] | 0 00000 [10:6] | movci 000001 [5:0]

format: movt rd, rs, cc (mips iv)

purpose: tests a floating-point condition code and conditionally moves the contents of a general-purpose register.

description: if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is true (1), the contents of cpu general-purpose register rs are stored in cpu general-purpose register rd.
the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
tf specifies which is used as the branch condition, true or false. the value of tf is fixed for each instruction.

operation:
32, 64  t: if fpconditioncode(cc) = 1 then
               gpr[rd] ← gpr[rs]
           endif

exceptions:
    coprocessor unusable exception
    reserved instruction exception
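movf and movt share the movci encoding and differ only in the tf bit, so a single decoder can serve both. the following sketch decodes the fields at the bit positions given in the encoding and applies the conditional move. the function names are hypothetical, and the eight condition codes are modeled as a plain list rather than as bits of fcr31 or fcr25.

```python
def decode_movci(word: int) -> dict:
    """Split a 32-bit movci instruction word into its fields."""
    if (word >> 26) != 0b000000 or (word & 0x3F) != 0b000001:
        raise ValueError("not a special/movci encoding")
    return {
        "rs": (word >> 21) & 0x1F,  # bits 25..21
        "cc": (word >> 18) & 0x7,   # bits 20..18
        "tf": (word >> 16) & 0x1,   # bit 16: 0 = movf, 1 = movt
        "rd": (word >> 11) & 0x1F,  # bits 15..11
    }


def exec_movci(word: int, gpr: list, fcc: list) -> None:
    """Copy gpr[rs] to gpr[rd] when condition code cc equals tf."""
    f = decode_movci(word)
    if fcc[f["cc"]] == f["tf"]:
        gpr[f["rd"]] = gpr[f["rs"]]


def encode_movci(rd: int, rs: int, cc: int, tf: int) -> int:
    """Build the instruction word from its fields (inverse of decode_movci)."""
    return (rs << 21) | (cc << 18) | (tf << 16) | (rd << 11) | 0b000001
```

with condition code 1 set, encode_movci(2, 3, 1, 1) (a movt) copies register 3 to register 2, while the same fields with tf = 0 (a movf) leave register 2 unchanged.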
movt.fmt floating-point move conditional on fpu true

encoding: cop1 010001 [31:26] | fmt [25:21] | cc [20:18] | 0 [17] | tf 1 [16] | fs [15:11] | fd [10:6] | movcf 010001 [5:0]

format: movt.s fd, fs, cc (mips iv)
        movt.d fd, fs, cc

purpose: tests a floating-point condition code and conditionally moves a floating-point value.

description: if the condition code bit (cc bit) of the floating-point control register (fcr31 or fcr25) specified by cc is true (1), the contents of floating-point register fs are stored in floating-point register fd. the source and destination operands are processed as floating-point format fmt.
the cc bit of fcr31 and fcr25 is set by a floating-point comparison instruction (c.cond.fmt).
tf specifies which is used as the branch condition, true or false. the value of tf is fixed for each instruction.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
this instruction is executed non-arithmetically, and no ieee 754 exception occurs.

operation:
32, 64  t: if fpconditioncode(cc) = 1 then
               storefpr (fd, fmt, valuefpr (fs, fmt))
           else
               storefpr (fd, fmt, valuefpr (fd, fmt))
           endif

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
movz.fmt floating-point move conditional on zero

encoding: cop1 010001 [31:26] | fmt [25:21] | rt [20:16] | fs [15:11] | fd [10:6] | movz 010010 [5:0]

format: movz.s fd, fs, rt (mips iv)
        movz.d fd, fs, rt

purpose: tests the value of a general-purpose register and conditionally moves a floating-point value.

description: if the contents of cpu general-purpose register rt are 0, this instruction stores the contents of floating-point register fs in floating-point register fd. the source and destination operands are processed as floating-point format fmt.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
this instruction is executed non-arithmetically, and no ieee 754 exception occurs.

operation:
32, 64  t: if gpr[rt] = 0 then
               storefpr (fd, fmt, valuefpr (fs, fmt))
           else
               storefpr (fd, fmt, valuefpr (fd, fmt))
           endif

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
msub.fmt floating-point multiply-subtract

encoding: cop1x 010011 [31:26] | fr [25:21] | ft [20:16] | fs [15:11] | fd [10:6] | msub 101 [5:3] | fmt [2:0]

format: msub.s fd, fr, fs, ft (mips iv)
        msub.d fd, fr, fs, ft

purpose: combines multiplication and subtraction of floating-point values for execution.

description: this instruction multiplies the contents of floating-point register fs by the contents of floating-point register ft, subtracts the contents of floating-point register fr from the result, and stores the result of the subtraction in floating-point register fd. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode. the operand is processed as floating-point format fmt.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
if the condition of an exception is detected but the exception does not occur, the cause bit and flag bit of a floating-point control register are ored and the result is written to the flag bit.

operation:
32, 64  t: storefpr (fd, fmt, valuefpr (fs, fmt) * valuefpr (ft, fmt) − valuefpr (fr, fmt))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    overflow exception
    underflow exception

caution: if the result of multiplication is a denormalized number, or an underflow or overflow occurs, an unimplemented operation exception actually occurs.
mtc1 move word to fpu (coprocessor 1)

encoding: cop1 010001 [31:26] | mt 00100 [25:21] | rt [20:16] | fs [15:11] | 0 00000000000 [10:0]

format: mtc1 rt, fs (mips i)

purpose: copies a word from a general-purpose register to an fpu (cp1) general-purpose register.

description: this instruction stores the contents of cpu general-purpose register rt in floating-point general-purpose register fs.
how the floating-point general-purpose register is accessed differs depending on the setting of the fr bit of the status register.
if the fr bit is 0, all 32 floating-point general-purpose registers can be accessed. to transfer double-precision data, access an odd register for the higher 32 bits and an even register for the lower 32 bits, depending on the format of the floating-point operation instruction.
if the fr bit is 1, all 32 floating-point general-purpose registers can be accessed, but only the lower 32 bits of each register are accessed for data.

operation:
32, 64  t:     data ← gpr[rt]_31..0
        t + 1: if sr_26 = 1 then
                   fgr[fs] ← undefined^32 || data
               else
                   fgr[fs] ← data
               endif

exceptions:
    coprocessor unusable exception
mul.fmt floating-point multiply

encoding: cop1 010001 [31:26] | fmt [25:21] | ft [20:16] | fs [15:11] | fd [10:6] | mul 000010 [5:0]

format: mul.s fd, fs, ft (mips i)
        mul.d fd, fs, ft

purpose: multiplies floating-point values.

description: this instruction multiplies the contents of floating-point register fs by the contents of floating-point register ft, and stores the result in floating-point register fd. the operand is processed as floating-point format fmt.
this instruction is valid only in single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
32, 64  t: storefpr (fd, fmt, valuefpr (fs, fmt) * valuefpr (ft, fmt))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    overflow exception
    underflow exception
neg.fmt floating-point negate

encoding: cop1 010001 [31:26] | fmt [25:21] | 0 00000 [20:16] | fs [15:11] | fd [10:6] | neg 000111 [5:0]

format: neg.s fd, fs (mips i)
        neg.d fd, fs

purpose: executes a negation operation of a floating-point value.

description: this instruction inverts the sign of the contents of floating-point register fs and stores the result in floating-point register fd. the operand is processed as floating-point format fmt.
the sign is arithmetically inverted. therefore, an instruction whose operand is nan is invalid.
this instruction is valid only in single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
32, 64  t: storefpr (fd, fmt, negate (valuefpr (fs, fmt)))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
nmadd.fmt floating-point negate multiply-add

encoding: cop1x 010011 [31:26] | fr [25:21] | ft [20:16] | fs [15:11] | fd [10:6] | nmadd 110 [5:3] | fmt [2:0]

format: nmadd.s fd, fr, fs, ft (mips iv)
        nmadd.d fd, fr, fs, ft

purpose: combines multiplication and addition of floating-point values for execution and executes a negation operation on the results.

description: this instruction multiplies the contents of floating-point register fs by the contents of floating-point register ft, adds the contents of floating-point register fr, inverts the sign of the result, and stores the result in floating-point register fd. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode. the operand is processed as floating-point format fmt.
the sign is arithmetically inverted. therefore, an instruction whose operand is nan is invalid.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
if the condition of an exception is detected but the exception does not occur, the cause bit and flag bit of a floating-point control register are ored and the result is written to the flag bit.

operation:
32, 64  t: storefpr (fd, fmt, negate (valuefpr (fr, fmt) + valuefpr (fs, fmt) * valuefpr (ft, fmt)))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    overflow exception
    underflow exception

caution: if the result of multiplication is a denormalized number, or an underflow or overflow occurs, an unimplemented operation exception actually occurs.
nmsub.fmt floating-point negate multiply-subtract

encoding: cop1x 010011 [31:26] | fr [25:21] | ft [20:16] | fs [15:11] | fd [10:6] | nmsub 111 [5:3] | fmt [2:0]

format: nmsub.s fd, fr, fs, ft (mips iv)
        nmsub.d fd, fr, fs, ft

purpose: combines multiplication and subtraction of floating-point values for execution and executes a negation operation on the results.

description: this instruction multiplies the contents of floating-point register fs by the contents of floating-point register ft, subtracts the contents of floating-point register fr, inverts the sign of the result, and stores the result in floating-point register fd. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode. the operand is processed as floating-point format fmt.
the sign is arithmetically inverted. therefore, an instruction whose operand is nan is invalid.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
if the condition of an exception is detected but the exception does not occur, the cause bit and flag bit of a floating-point control register are ored and the result is written to the flag bit.

operation:
32, 64  t: storefpr (fd, fmt, negate (valuefpr (fs, fmt) * valuefpr (ft, fmt) − valuefpr (fr, fmt)))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    overflow exception
    underflow exception

caution: if the result of multiplication is a denormalized number, or an underflow or overflow occurs, an unimplemented operation exception actually occurs.
prefx prefetch indexed

encoding: cop1x 010011 [31:26] | base [25:21] | index [20:16] | hint [15:11] | 0 00000 [10:6] | prefx 001111 [5:0]

format: prefx hint, index (base) (mips iv)

purpose: prefetches data from memory (general-purpose register + general-purpose register addressing).

description: this instruction adds the contents of cpu general-purpose register base and the contents of cpu general-purpose register index to generate a virtual address. it then loads the contents at the specified address position to the data cache. bits 15 to 11 (hint) of this instruction indicate how the loaded data is used. note, however, that the contents of hint are only used by the processor to judge whether prefetching by this instruction is valid, and do not affect the actual operation.
hint indicates the following operations.

    hint      operation   description
    0         load        predicts that data is loaded (without modification). fetches data as if it were loaded.
    1 to 31   −           reserved

this is an auxiliary instruction that improves program performance. the generated address and the contents of hint do not change the status of the processor or system, or the meaning (purpose) of the program.
if this instruction causes a memory access to occur, the access type to be used is determined by the generated address. in other words, the access type used to load/store the generated address is also used for this instruction. however, an access to an uncached area does not occur.
if a translation entry for the specified memory position is not in the tlb, data cannot be prefetched from the mapped area. the absence of a translation entry in the tlb means that the memory position has not been accessed recently, so no benefit can be expected from prefetching data at that position.
exceptions related to addressing do not occur as a result of executing this instruction. if the condition of an exception is detected, it is ignored, but the prefetch is not executed either. however, even if nothing is prefetched, processing that is not visible to the program, such as writing back a dirty cache line, may be performed.
the operation is undefined if bits 63 and 62 of the virtual address are not the same as bits 63 and 62 of general-purpose register base.

operation:
32, 64  t: vaddr ← gpr[base] + gpr[index]
           (paddr, cca) ← addresstranslation (vaddr, data, load)
           prefetch (cca, paddr, vaddr, data, hint)

exceptions:
    coprocessor unusable exception
    reserved instruction exception
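the rule that prefx never raises an addressing exception, and silently does nothing when the address has no tlb entry or lies in an uncached area, can be sketched as follows. everything in this model is an assumption made for illustration: the function name, the dictionary-based tlb keyed by virtual page number, the 4 kb page size, and the 32-byte cache line are hypothetical and are not taken from this manual.

```python
MASK64 = (1 << 64) - 1
PAGE_SHIFT = 12   # hypothetical 4 KB pages
LINE_BYTES = 32   # hypothetical cache line size


def prefx(base: int, index: int, hint: int, tlb: dict, cache: set) -> bool:
    """Model a prefetch; return True only if a line was brought into the cache."""
    if not 0 <= hint <= 31:
        raise ValueError("hint is a 5-bit field")
    vaddr = (base + index) & MASK64
    entry = tlb.get(vaddr >> PAGE_SHIFT)
    if entry is None:
        # No translation entry: no exception is raised and nothing is prefetched.
        return False
    if not entry["cached"]:
        # Accesses to an uncached area do not occur.
        return False
    paddr = (entry["pfn"] << PAGE_SHIFT) | (vaddr & ((1 << PAGE_SHIFT) - 1))
    cache.add(paddr & ~(LINE_BYTES - 1))  # record the line-aligned address
    return True
```

the hint value is validated but otherwise unused, which mirrors the statement that hint does not affect the actual operation.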
recip.fmt reciprocal

encoding: cop1 010001 [31:26] | fmt [25:21] | 0 00000 [20:16] | fs [15:11] | fd [10:6] | recip 010101 [5:0]

format: recip.s fd, fs (mips iv)
        recip.d fd, fs

purpose: calculates the approximate value of the reciprocal of a floating-point value (high speed).

description: this instruction calculates the reciprocal of the contents of floating-point register fs and stores the result in floating-point register fd. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode. the operand is processed as floating-point format fmt.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
32, 64  t: storefpr (fd, fmt, 1.0 / valuefpr (fs, fmt))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    division-by-zero exception
    overflow exception
    underflow exception
round.l.fmt floating-point round to long fixed-point format

encoding: cop1 010001 [31:26] | fmt [25:21] | 0 00000 [20:16] | fs [15:11] | fd [10:6] | round.l 001000 [5:0]

format: round.l.s fd, fs (mips iii)
        round.l.d fd, fs

purpose: converts a floating-point value into a 64-bit fixed-point value rounded to the closest value.

description: this instruction arithmetically converts the contents of floating-point register fs into a 64-bit fixed-point format, and stores the result in floating-point register fd. the source operand is processed as floating-point format fmt.
the result is rounded to the closest value, or to the even value if two values are equally close, regardless of the current rounding mode.
this instruction is valid only when converting from single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
if the source operand is infinity or nan, or if the result of rounding is outside the range of 2^63 − 1 to −2^63, the flag bits of fcr31 and fcr26 are set to indicate an invalid operation. if an invalid operation exception is not enabled, the exception does not occur, and 2^63 − 1 is returned.
this operation is defined in 64-bit mode or in 32-bit kernel mode. execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

operation:
64  t: storefpr (fd, l, convertfmt (valuefpr (fs, fmt), fmt, l))

remark: the operation is the same in the 32-bit kernel mode.

exceptions:
    coprocessor unusable exception
    floating-point operation exception
    reserved instruction exception (32-bit user/supervisor mode)

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    overflow exception

caution: the unimplemented operation exception occurs in the following cases.
    - if overflow occurs when the format is converted into a fixed-point format
    - if the source operand is infinity
    - if the source operand is nan
specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^53 − 1 (0x001f ffff ffff ffff) to −2^53 (0xffe0 0000 0000 0000).
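the round-to-nearest-even conversion and the range stated in the caution can be sketched in python, whose built-in round() on a float also rounds halfway cases to the even integer. this is a simplified model under assumptions: round_l is a hypothetical name, the status strings are illustrative, and the fcr31/fcr26 flag updates and exception enables are not modeled.

```python
import math

LIMIT = 1 << 53  # bound taken from the caution: 2**53 - 1 down to -2**53


def round_l(x: float):
    """Return (result, status) for a modeled round.l.d conversion."""
    if math.isnan(x) or math.isinf(x):
        # Infinity and NaN operands are listed as unimplemented operation cases.
        return None, "unimplemented operation"
    result = round(x)  # nearest integer, ties go to the even integer
    if result > LIMIT - 1 or result < -LIMIT:
        # Outside 2**53 - 1 .. -2**53: reported as unimplemented operation.
        return None, "unimplemented operation"
    return result, "ok"
```

for example, round_l(2.5) and round_l(1.5) both give 2, and round_l(2.0 ** 53) falls just outside the accepted range while round_l(-(2.0 ** 53)) is still inside it.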
round.w.fmt floating-point round to single fixed-point format

encoding: cop1 010001 [31:26] | fmt [25:21] | 0 00000 [20:16] | fs [15:11] | fd [10:6] | round.w 001100 [5:0]

format: round.w.s fd, fs (mips ii)
        round.w.d fd, fs

purpose: converts a floating-point value into a 32-bit fixed-point value rounded to the closest value.

description: this instruction arithmetically converts the contents of floating-point register fs into a 32-bit fixed-point format, and stores the result in floating-point register fd. the source operand is processed as floating-point format fmt.
the result is rounded to the closest value, or to the even value if two values are equally close, regardless of the current rounding mode.
this instruction is valid only when converting from single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
if the source operand is infinity or nan, or if the result of rounding is outside the range of 2^31 − 1 to −2^31, the flag bits of fcr31 and fcr26 are set to indicate an invalid operation. if an invalid operation exception is not enabled, the exception does not occur, and 2^31 − 1 is returned.

operation:
32, 64  t: storefpr (fd, w, convertfmt (valuefpr (fs, fmt), fmt, w))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    overflow exception

caution: the unimplemented operation exception occurs in the following cases.
    - if overflow occurs when the format is converted into a fixed-point format
    - if the source operand is infinity
    - if the source operand is nan
specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^31 − 1 (0x7fff ffff) to −2^31 (0x8000 0000).
rsqrt.fmt reciprocal square root

encoding: cop1 010001 [31:26] | fmt [25:21] | 0 00000 [20:16] | fs [15:11] | fd [10:6] | rsqrt 010110 [5:0]

format: rsqrt.s fd, fs (mips iv)
        rsqrt.d fd, fs

purpose: calculates the approximate value of the reciprocal of the square root of a floating-point value (high speed).

description: this instruction calculates the positive arithmetic square root of the contents of floating-point register fs, takes the reciprocal of the result, and stores it in floating-point register fd. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode. the operand is processed as floating-point format fmt.
if the fr bit of the status register is 0, only an even register number can be specified, because each floating-point register is formed from an adjacent even/odd register pair. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
32, 64  t: storefpr (fd, fmt, 1.0 / squareroot (valuefpr (fs, fmt)))

exceptions:
    coprocessor unusable exception
    reserved instruction exception
    floating-point operation exception

floating-point operation exceptions:
    unimplemented operation exception
    invalid operation exception
    inexact operation exception
    division-by-zero exception
    overflow exception
    underflow exception
sdc1 store doubleword from fpu (coprocessor 1)

encoding: sdc1 111101 [31:26] | base [25:21] | ft [20:16] | offset [15:0]

format: sdc1 ft, offset (base) (mips ii)

purpose: stores a doubleword from a floating-point register to memory.

description: this instruction sign-extends a 16-bit offset and adds the result to the contents of general-purpose register base to generate a virtual address.
if the fr bit of the status register is 0, this instruction stores the contents of floating-point registers ft and ft + 1 in the memory specified by the virtual address as a doubleword. at this time, the contents of the odd-numbered register specified by ft + 1 correspond to the higher 32 bits of the doubleword, and the contents of the even-numbered register specified by ft correspond to the lower 32 bits. the operation is undefined if the least significant bit of the ft field is not 0.
if the fr bit is 1, the contents of floating-point register ft are stored in the memory specified by the virtual address as a doubleword.
if the lower 3 bits of the address are not 0, an address error exception occurs.

operation:
32  t: vaddr ← ((offset_15)^16 || offset_15..0) + gpr[base]
       (paddr, uncached) ← addresstranslation (vaddr, data)
       if sr_26 = 1 then
           data ← fgr[ft]_63..0
       elseif ft_0 = 0 then
           data ← fgr[ft + 1]_31..0 || fgr[ft]_31..0
       else
           data ← undefined^64
       endif
       storememory (uncached, doubleword, data, paddr, vaddr, data)
64  t: vaddr ← ((offset_15)^48 || offset_15..0) + gpr[base]
       (paddr, uncached) ← addresstranslation (vaddr, data)
       if sr_26 = 1 then
           data ← fgr[ft]_63..0
       elseif ft_0 = 0 then
           data ← fgr[ft + 1]_31..0 || fgr[ft]_31..0
       else
           data ← undefined^64
       endif
       storememory (uncached, doubleword, data, paddr, vaddr, data)

exceptions:
    coprocessor unusable exception
    tlb refill exception
    tlb invalid exception
    tlb modified exception
    bus error exception
    address error exception
    reserved instruction exception
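the offset sign extension, the alignment check, and the assembly of the doubleword from a register pair can be sketched as below. this is a model under assumptions: the function names are hypothetical, memory is a dictionary keyed by virtual address with no address translation, and the undefined case (odd ft with fr = 0) is reported as an error rather than producing undefined data.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def sign_extend16(offset: int) -> int:
    """Interpret the 16-bit offset field as a signed value."""
    offset &= 0xFFFF
    return offset - 0x10000 if offset & 0x8000 else offset


def sdc1_data(fgr: list, ft: int, fr_bit: int) -> int:
    """Build the 64-bit store data from the modeled register file."""
    if fr_bit == 1:
        return fgr[ft] & MASK64
    if ft & 1 == 0:
        # Odd register ft + 1 supplies the higher 32 bits, even register ft the lower.
        return ((fgr[ft + 1] & MASK32) << 32) | (fgr[ft] & MASK32)
    raise ValueError("undefined: odd ft with fr = 0")


def sdc1(mem: dict, gpr: list, fgr: list, base: int, ft: int,
         offset: int, fr_bit: int) -> None:
    """Store a doubleword at gpr[base] + sign-extended offset."""
    vaddr = (gpr[base] + sign_extend16(offset)) & MASK64
    if vaddr & 0x7:
        raise ValueError("address error exception: vaddr is not doubleword-aligned")
    mem[vaddr] = sdc1_data(fgr, ft, fr_bit)
```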
sdxc1    store doubleword indexed from fpu (coprocessor 1)

  31..26         25..21   20..16   15..11   10..6     5..0
  cop1x 010011   base     index    fs       0 00000   sdxc1 001001

format: sdxc1 fs, index (base)    mips iv

purpose: stores a doubleword from a floating-point register to memory (general-purpose register + general-purpose register addressing).

description: this instruction adds the contents of cpu general-purpose register index and the contents of cpu general-purpose register base to generate a virtual address.
if the fr bit of the status register is 0, this instruction stores the contents of floating-point registers fs and fs + 1 in the memory specified by the virtual address as a doubleword. at this time, the contents of the odd-numbered register specified by fs + 1 correspond to the higher 32 bits of the doubleword, and the contents of the even-numbered register specified by fs correspond to the lower 32 bits. the operation is undefined if the least significant bit of the fs field is not 0.
if the fr bit is 1, the contents of floating-point register fs are stored in the memory specified by the virtual address as a doubleword.
the operation is undefined if bits 63 and 62 of the virtual address are not the same as bits 63 and 62 of general-purpose register base.
an address error exception occurs if the lower 3 bits of the virtual address are not 0.

operation:
  32, 64  t: vaddr ← gpr[base] + gpr[index]
             (paddr, cca) ← address translation (vaddr, data)
             if sr_26 = 1 then
                 data ← fgr[fs]
             elseif fs_0 = 0 then
                 data ← fgr[fs + 1] || fgr[fs]
             else
                 data ← undefined^64
             endif
             storememory (cca, doubleword, data, paddr, vaddr, data)

exceptions:
  coprocessor unusable exception
  tlb refill exception
  tlb invalid exception
  tlb modified exception
  address error exception
  reserved instruction exception
sqrt.fmt    floating-point square root

  31..26        25..21   20..16    15..11   10..6   5..0
  cop1 010001   fmt      0 00000   fs       fd      sqrt 000100

format: sqrt.s fd, fs    mips ii
        sqrt.d fd, fs

purpose: calculates the square root of a floating-point value.

description: this instruction calculates the positive arithmetic square root of the contents of floating-point register fs and stores the result in floating-point register fd. the operand is processed as floating-point format fmt. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode. the result is −0 if the value of the source operand is −0.
this instruction is valid only in single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
  32, 64  t: storefpr (fd, fmt, squareroot (valuefpr (fs, fmt)))

exceptions:
  coprocessor unusable exception
  reserved instruction exception
  floating-point operation exception

floating-point operation exceptions:
  unimplemented operation exception
  invalid operation exception
  inexact operation exception
sub.fmt    floating-point subtract

  31..26        25..21   20..16   15..11   10..6   5..0
  cop1 010001   fmt      ft       fs       fd      sub 000001

format: sub.s fd, fs, ft    mips i
        sub.d fd, fs, ft

purpose: subtracts a floating-point value.

description: this instruction subtracts the contents of floating-point register ft from the contents of floating-point register fs, and stores the result in floating-point register fd. the operation is executed as if it were of infinite accuracy, and the result is rounded in accordance with the current rounding mode.
this instruction is valid only in single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.

operation:
  32, 64  t: storefpr (fd, fmt, valuefpr (fs, fmt) − valuefpr (ft, fmt))

exceptions:
  coprocessor unusable exception
  floating-point operation exception

floating-point operation exceptions:
  unimplemented operation exception
  invalid operation exception
  inexact operation exception
  overflow exception
  underflow exception
suxc1    store doubleword indexed unaligned from fpu (coprocessor 1)

  31..26         25..21   20..16   15..11   10..6     5..0
  cop1x 010011   base     index    fs       0 00000   suxc1 001101

format: suxc1 fs, index (base)    mips v

purpose: stores a doubleword from a floating-point register to memory (general-purpose register + general-purpose register addressing).

description: this instruction adds the contents of cpu general-purpose register index and the contents of cpu general-purpose register base to generate a virtual address. the lower 3 bits of the virtual address are masked to 0. therefore, an address error exception does not occur even if the lower 3 bits of the virtual address are not 0.
if the fr bit of the status register is 0, this instruction stores the contents of floating-point registers fs and fs + 1 in the memory specified by the virtual address as a doubleword. at this time, the contents of the odd-numbered register specified by fs + 1 correspond to the higher 32 bits of the doubleword, and the contents of the even-numbered register specified by fs correspond to the lower 32 bits. the operation is undefined if the least significant bit of the fs field is not 0.
the operation is undefined if bits 63 and 62 of the virtual address are not the same as bits 63 and 62 of general-purpose register base.

operation:
  32, 64  t: vaddr ← (gpr[base] + gpr[index])_63..3 || 0^3
             (paddr, cca) ← address translation (vaddr, data)
             if sr_26 = 1 then
                 data ← fgr[fs]
             elseif fs_0 = 0 then
                 data ← fgr[fs + 1] || fgr[fs]
             else
                 data ← undefined^64
             endif
             storememory (cca, doubleword, data, paddr, vaddr, data)

exceptions:
  coprocessor unusable exception
  tlb refill exception
  tlb invalid exception
  tlb modified exception
  bus error exception
  address error exception
  reserved instruction exception
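The difference between the indexed store with an alignment check (sdxc1) and the unaligned form (suxc1) comes down to how the lower 3 bits of the sum are treated. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, registers are modeled as 64-bit integers, and the bits 63 and 62 comparison is exposed as a separate helper because the manual only says the operation is undefined when it fails.

```python
MASK64 = (1 << 64) - 1

def indexed_vaddr(gpr_base, gpr_index):
    """Sum of base and index, wrapped to 64 bits (as used by sdxc1)."""
    return (gpr_base + gpr_index) & MASK64

def suxc1_vaddr(gpr_base, gpr_index):
    """Same sum with the lower 3 bits forced to 0, so no alignment error can occur."""
    return indexed_vaddr(gpr_base, gpr_index) & ~0x7

def top_bits_match(vaddr, gpr_base):
    """True if bits 63 and 62 of vaddr equal those of the base register."""
    return (vaddr >> 62) == ((gpr_base & MASK64) >> 62)
```

For example, base 0x1003 and index 0x4 sum to 0x1007; sdxc1 would take an address error exception there, while suxc1 stores at 0x1000.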
swc1    store word from fpu (coprocessor 1)

  31..26        25..21   20..16   15..0
  swc1 111001   base     ft       offset

format: swc1 ft, offset (base)    mips i

purpose: stores a word from a floating-point register to memory.

description: this instruction sign-extends a 16-bit offset and adds the result to the contents of general-purpose register base to generate a virtual address. the contents of floating-point general-purpose register ft are stored in the memory at the specified address.
if the fr bit of the status register is 0 and if the least significant bit of the ft field is 0, the contents of the lower 32 bits of floating-point register ft are stored. if the least significant bit of the ft field is 1, the contents of the higher 32 bits of floating-point register ft − 1 are stored.
if the fr bit is 1, all the 64-bit floating-point registers can be accessed. therefore, the contents of the lower 32 bits of floating-point register ft are stored.
if the lower 2 bits of the address are not 0, an address error exception occurs.

operation:
  32  t: vaddr ← ((offset_15)^16 || offset_15..0) + gpr[base]
         (paddr, uncached) ← address translation (vaddr, data)
         data ← fgr[ft]_31..0
         storememory (uncached, word, data, paddr, vaddr, data)

  64  t: vaddr ← ((offset_15)^48 || offset_15..0) + gpr[base]
         (paddr, uncached) ← address translation (vaddr, data)
         data ← fgr[ft]_31..0
         storememory (uncached, word, data, paddr, vaddr, data)

exceptions:
  coprocessor unusable exception
  tlb refill exception
  tlb invalid exception
  tlb modified exception
  bus error exception
  address error exception
  reserved instruction exception
swxc1    store word indexed from fpu (coprocessor 1)

  31..26         25..21   20..16   15..11   10..6     5..0
  cop1x 010011   base     index    fs       0 00000   swxc1 001000

format: swxc1 fs, index (base)    mips iv

purpose: stores a word from a floating-point register to memory (general-purpose register + general-purpose register addressing).

description: this instruction adds the contents of cpu general-purpose register index and the contents of cpu general-purpose register base to generate a virtual address. the contents of floating-point register fs are stored in the memory specified by the virtual address.
if the fr bit of the status register is 0 and if the least significant bit of the fs field is 0, the contents of the lower 32 bits of floating-point register fs are stored. if the least significant bit of the fs field is 1, the contents of the higher 32 bits of floating-point register fs − 1 are stored.
if the fr bit is 1, the contents of the lower 32 bits of floating-point register fs are stored in the memory specified by the virtual address.
the operation is undefined if bits 63 and 62 of the virtual address are not the same as bits 63 and 62 of general-purpose register base.
if the lower 2 bits of the virtual address are not 0, an address error exception occurs.

operation:
  32, 64  t: vaddr ← gpr[base] + gpr[index]
             (paddr, cca) ← address translation (vaddr, data)
             if sr_26 = 1 then
                 data ← data_63..32 || fgr[fs]_31..0
             elseif fs_0 = 0 then
                 data ← data_63..32 || fgr[fs]_31..0
             else
                 data ← fgr[fs − 1]_63..32 || data_31..0
             endif
             storememory (cca, word, data, paddr, vaddr, data)

exceptions:
  coprocessor unusable exception
  tlb refill exception
  tlb invalid exception
  tlb modified exception
  address error exception
  reserved instruction exception
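The rule for which 32 bits a word store (swc1 or swxc1) takes from the register file can be written as a small Python sketch. It is an illustration only: the function name and the list-based register file are assumptions, and each entry of fgr is treated as a 64-bit physical register.

```python
def store_word_source(fgr, reg, fr):
    """Return the 32-bit word that a word store would write.

    fr = 1, or fr = 0 with an even register number: lower 32 bits of fgr[reg].
    fr = 0 with an odd register number: higher 32 bits of fgr[reg - 1].
    """
    if fr == 1 or reg & 1 == 0:
        return fgr[reg] & 0xFFFFFFFF
    return (fgr[reg - 1] >> 32) & 0xFFFFFFFF
```

With fr = 0, register numbers 4 and 5 therefore name the two halves of the same 64-bit register.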
trunc.l.fmt    floating-point truncate to long fixed-point format

  31..26        25..21   20..16    15..11   10..6   5..0
  cop1 010001   fmt      0 00000   fs       fd      trunc.l 001001

format: trunc.l.s fd, fs    mips iii
        trunc.l.d fd, fs

purpose: converts a floating-point value into a 64-bit fixed-point value rounded toward zero.

description: this instruction arithmetically converts the contents of floating-point register fs into a 64-bit fixed-point format, and stores the result in floating-point register fd. the source operand is processed as floating-point format fmt. the result is rounded toward zero regardless of the current rounding mode.
this instruction is valid only when converting from single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
if the source operand is infinity or nan, or if the result of rounding is outside the range of 2^63 − 1 to −2^63, the flag bits of fcr31 and fcr26 are set to indicate an invalid operation. if an invalid operation exception is not enabled, the exception does not occur, and 2^63 − 1 is returned.
this operation is defined in 64-bit mode or in 32-bit kernel mode. execution of this instruction in 32-bit user or supervisor mode causes a reserved instruction exception.

operation:
  64  t: storefpr (fd, l, convertfmt (valuefpr (fs, fmt), fmt, l))

remark the operation is the same in the 32-bit kernel mode.

exceptions:
  coprocessor unusable exception
  floating-point operation exception
  reserved instruction exception (32-bit user/supervisor mode)

floating-point operation exceptions:
  unimplemented operation exception
  invalid operation exception
  inexact operation exception
  overflow exception

caution the unimplemented operation exception occurs in the following cases.
  • if overflow occurs when the format is converted into a fixed-point format
  • if the source operand is infinity
  • if the source operand is nan
specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^53 − 1 (0x001f ffff ffff ffff) to −2^53 (0xffe0 0000 0000 0000).
trunc.w.fmt    floating-point truncate to single fixed-point format

  31..26        25..21   20..16    15..11   10..6   5..0
  cop1 010001   fmt      0 00000   fs       fd      trunc.w 001101

format: trunc.w.s fd, fs    mips ii
        trunc.w.d fd, fs

purpose: converts a floating-point value into a 32-bit fixed-point value rounded toward zero.

description: this instruction arithmetically converts the contents of floating-point register fs into a 32-bit fixed-point format, and stores the result in floating-point register fd. the source operand is processed as floating-point format fmt. the result is rounded toward zero regardless of the current rounding mode.
this instruction is valid only when converting from single-/double-precision floating-point formats.
if the fr bit of the status register is 0, only an even register number can be specified because a pair of even and odd numbers adjoining each other is used as the register number of a floating-point register. if an odd number is specified, the operation is undefined. if the fr bit of the status register is 1, both odd and even register numbers are valid.
if the source operand is infinity or nan, or if the result of rounding is outside the range of 2^31 − 1 to −2^31, the flag bits of fcr31 and fcr26 are set to indicate an invalid operation. if an invalid operation exception is not enabled, the exception does not occur, and 2^31 − 1 is returned.

operation:
  32, 64  t: storefpr (fd, w, convertfmt (valuefpr (fs, fmt), fmt, w))

exceptions:
  coprocessor unusable exception
  floating-point operation exception

floating-point operation exceptions:
  unimplemented operation exception
  invalid operation exception
  inexact operation exception
  overflow exception

caution the unimplemented operation exception occurs in the following cases.
  • if overflow occurs when the format is converted into a fixed-point format
  • if the source operand is infinity
  • if the source operand is nan
specifically, the exception occurs if the value stored in floating-point register fd is outside the range of 2^31 − 1 (0x7fff ffff) to −2^31 (0x8000 0000).
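The cautions for trunc.l.fmt and trunc.w.fmt state a result range outside of which the unimplemented operation exception occurs: 2^53 − 1 to −2^53 for trunc.l and 2^31 − 1 to −2^31 for trunc.w. A minimal Python sketch of that check follows. It models only the rounding toward zero and the range test; the function name and the tuple return value are assumptions made for illustration.

```python
import math

def trunc_to_fixed(value, limit_bits):
    """Truncate toward zero and apply the range check from the caution.

    limit_bits is 31 for trunc.w and 53 for trunc.l.
    Returns (result, unimplemented), where unimplemented is True when the
    unimplemented operation exception would occur (result is then None).
    """
    if math.isnan(value) or math.isinf(value):
        return None, True
    result = math.trunc(value)          # rounds toward zero
    lowest = -(1 << limit_bits)
    highest = (1 << limit_bits) - 1
    if result < lowest or result > highest:
        return None, True
    return result, False
```

Note that the lower bound itself is in range, so −2^31 converts normally with trunc.w while +2^31 does not.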
18.5 fpu instruction opcode bit encoding

figure 18-3 lists the vr5500 instruction opcode bit encoding.

figure 18-3. fpu instruction opcode bit encoding

opcode (rows: bits 31...29, columns: bits 28...26)
        0         1      2      3       4      5      6      7
  0     special
  1
  2               cop1          cop1x
  3
  4
  5
  6               lwc1                         ldc1
  7               swc1                         sdc1

sub (rows: bits 25...24, columns: bits 23...21)
        0      1      2      3      4      5      6      7
  0     mf     dmf    cf     γ      mt     dmt    ct     γ
  1     bc     γ      γ      γ      γ      γ      γ      γ
  2     s      d      γ      γ      w      l      γ      γ
  3     γ      γ      γ      γ      γ      γ      γ      γ

br (rows: bits 20...19, columns: bits 18...16)
        0      1      2      3      4      5      6      7
  0     bcf    bct    bcfl   bctl   *      *      *      *
  1     *      *      *      *      *      *      *      *
  2     *      *      *      *      *      *      *      *
  3     *      *      *      *      *      *      *      *

special function (rows: bits 5...3, columns: bits 2...0)
        0      1           2      3      4      5      6      7
  0     *      movf/movt   *      *      *      *      *      *
  1     *      *           *      *      *      *      *      *
  2     *      *           *      *      *      *      *      *
  3     *      *           *      *      *      *      *      *
  4     *      *           *      *      *      *      *      *
  5     *      *           *      *      *      *      *      *
  6     *      *           *      *      *      *      *      *
  7     *      *           *      *      *      *      *      *

cop1 function (rows: bits 5...3, columns: bits 2...0)
        0         1         2        3         4         5         6        7
  0     add       sub       mul      div       sqrt      abs       mov      neg
  1     round.l   trunc.l   ceil.l   floor.l   round.w   trunc.w   ceil.w   floor.w
  2     γ         γ         movz     movn      γ         recip     rsqrt    γ
  3     γ         γ         γ        γ         γ         γ         γ        γ
  4     cvt.s     cvt.d     γ        γ         cvt.w     cvt.l     γ        γ
  5     γ         γ         γ        γ         γ         γ         γ        γ
  6     c.f       c.un      c.eq     c.ueq     c.olt     c.ult     c.ole    c.ule
  7     c.sf      c.ngle    c.seq    c.ngl     c.lt      c.nge     c.le     c.ngt

cop1x function (rows: bits 5...3, columns: bits 2...0)
        0         1         2      3      4      5       6      7
  0     lwxc1     ldxc1     γ      γ      γ      luxc1   γ      γ
  1     swxc1     sdxc1     γ      γ      γ      suxc1   γ      prefx
  2     γ         γ         γ      γ      γ      γ       γ      γ
  3     γ         γ         γ      γ      γ      γ       γ      γ
  4     madd.s    madd.d    γ      γ      γ      γ       γ      γ
  5     msub.s    msub.d    γ      γ      γ      γ       γ      γ
  6     nmadd.s   nmadd.d   γ      γ      γ      γ       γ      γ
  7     nmsub.s   nmsub.d   γ      γ      γ      γ       γ      γ

remark the meanings of the symbols in the above figures are as follows.
  *: execution of operation codes marked with an asterisk causes a reserved instruction exception. they are reserved for future versions of the architecture.
  γ: execution of operation codes marked with a gamma causes an unimplemented operation exception. they are reserved for future versions of the architecture.
  η: if the operation code marked with an eta is executed, the result is valid only when the mips iii instruction set can be used. if the operation is executed when the instruction set cannot be used (32-bit user/supervisor mode), an unimplemented operation exception occurs.
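The row and column indices used in figure 18-3 are simply the upper and lower 3 bits of each 6-bit field. The Python sketch below shows how a 32-bit instruction word maps to a table cell; the function names are hypothetical and chosen only for this illustration.

```python
def opcode_cell(instruction):
    """Return (row, column) in the opcode table: bits 31...29 and 28...26."""
    opcode = (instruction >> 26) & 0x3F
    return opcode >> 3, opcode & 0x7

def function_cell(instruction):
    """Return (row, column) in a function table: bits 5...3 and 2...0."""
    function = instruction & 0x3F
    return function >> 3, function & 0x7
```

For example, the sdc1 opcode 111101 falls in row 7, column 5 of the opcode table, and the sqrt function code 000100 falls in row 0, column 4 of the cop1 function table.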
chapter 19 instruction hazards

19.1 overview

depending on the combination of instructions, the result cannot be predicted if two or more system events, such as a cache miss, interrupt, or exception, occur during execution. do not use such instruction combinations.
many hazards are caused by instructions that change the status or read data in different pipeline stages. these hazards are caused by a combination of instructions; no single instruction causes a hazard. other hazards occur when an instruction is re-executed after exception processing.

19.2 details of instruction hazard

with the vr5500, the hardware automatically avoids hazards, except those related to instruction fetch.
the following table shows the combinations of operations and sources that cause hazards related to instruction fetch, which make the operation unstable and prediction of the result impossible.

table 19-1. instruction hazard of vr5500

  operation                                        source                                             number of hazards
  instruction fetch                                entryhi.asid, tlb                                  note
  (during address translation)
  instruction fetch                                status.ksu, status.exl, status.erl, status.kx,     note
  (during address error detection)                 status.sx, status.ux
  instruction decode                               status.xx, status.cu, status.ksu, status.exl,      1
  (during detection of coprocessor and enable      status.erl, status.kx, status.sx, status.ux
  privileged instruction)

note if a change is made in the exception handler, it is accurately reflected after the eret instruction has been executed (compatible with mips64).
chapter 20 pll passive elements

connect some passive elements externally to the vddpa1, vddpa2, vsspa1, and vsspa2 pins for proper operation of the vr5500. connect the passive elements as close as possible to each pin.
figure 20-1 shows a connection diagram of the pll passive elements.

figure 20-1. example of connection of pll passive elements (diagram: l, c1, c2, and c3 connected between vdd, vss, and the vddpa1, vsspa1, vddpa2, and vsspa2 pins of the vr5500)

it is essential to isolate the analog power supply (vddpa1, vddpa2) and ground (vsspa1, vsspa2) for the pll circuit from the regular power supply (vdd) and ground (vss). examples of each passive element value are as follows.

  l = 10 µh
  c1 = 0.1 µf
  c2 = 100 pf
  c3 = 10 µf

since the optimum values for the filter elements depend on the application and the system noise environment, these values should be considered as starting points for further experimentation within your specific application.
chapter 21 debugging and testing

this chapter explains the debug and test functions of the vr5500 when a debugging tool is used. the debug functions explained in this chapter are independent of debugging with the watchlo and watchhi registers of cp0, and provide more sophisticated debugging. the debugging tool is connected via a test interface.

21.1 overview

if a debug break occurs, the processor transfers control to the debug exception vector, and enters the debug mode from the normal mode (normal operating status). in the debug mode, the resources of the processor are accessed and controlled internally or externally. test interfaces (jtag interface conforming to ieee1149.1 and debug interface conforming to the n-wire specifications) are used to access the processor's resources from an external device.

(1) internal access
this access is made by the program located at the debug exception vector, using debug instructions. of the resources of the processor, all the resources used in the normal mode (such as register files, caches, external memory, and external i/o) and debug registers can be accessed.

(2) external access
this access is made by the debugging tool externally connected via a test interface. all the resources of the processor (such as resources used in the normal mode, the debug registers, and the jtag registers) can be accessed.
figure 21-1. access to processor resources in debug mode (diagram: the external debugging tool reaches the vr5500 debug module; the debug registers and the resources used in normal mode are accessible internally, and these plus the jtag registers are accessible externally)

the debug registers can be accessed internally or externally only in the debug mode. these registers are used to set breakpoints and their statuses, and to change the status of the processor. these registers can be accessed only by using debug instructions.
the debug instructions are used to manipulate the debug registers and the resources used in the normal mode, to execute a debug break, and to restore the normal mode.
the externally accessed debug functions have been expanded by the n-wire specification debug interface. by using this interface, all the resources in the system, including the processor resources, can be monitored from an external debugging tool. for example, data can be loaded to the external memory in the debug mode, then the mode can be changed to the normal mode, and the result of the operation using this data can be monitored. the n-wire specification also allows access to the jtag registers. because both the debug registers and jtag registers can be accessed externally, the scope of control of the processor can be expanded compared with internal access.

note n of n-wire indicates the data bus width of the debug interface. because ntrcdata(3:0) specifies the bus width of the vr5500, n = 4.
21.2 test interface signals

table 21-1. test interface signals

  pin name          i/o      function                                                             recommended connection when not used
  jtck              input    jtag clock input. serial clock input signal for jtag.                pull up
  jtms              input    jtag mode selection. jtag test mode selection signal.                pull up
  jtdi              input    jtag data input. serial data input for jtag.                         pull up
  jtdo              output   jtag data output. serial data output for jtag.                       leave open
  jtrst#            input    jtag reset input. signal for initializing jtag test module           pull down
                             (only ver. 2.0 or later).
  ntrcdata(3:0)     output   trace data. data output of test interface.                           leave open
  ntrcend           output   trace end. signal indicating delimiting (end) of trace data packet.  leave open
  ntrcclk           output   trace clock. clock for test interface. same clock as sysclock        leave open
                             is output.
  rmode#/bktgio#    i/o      reset mode/break trigger output. debug reset input signal while      pull up
                             jtrst# signal (coldreset# signal of ver. 1.x) is active.
                             break or trigger i/o signal during normal operation.

remark # indicates active low.

(1) jtck (input)
input a serial clock for jtag to this pin. the maximum operating frequency is 33 mhz. this clock can operate asynchronously to the system clock (sysclock).
the jtdi and jtms signals are sampled at the rising edge of jtck. the status of the jtdo signal changes at the falling edge of jtck.

(2) jtms (input)
input a command, such as that for selecting mode, for controlling the test operation of jtag. the input command is decoded by the tap (test access port) controller.
when an external debugging tool is not connected, pull up this signal (this signal is not internally pulled up).

(3) jtdi (input)
input serial data for scanning to this pin.
when an external debugging tool is not connected, pull up this signal (this signal is not internally pulled up).

(4) jtdo (3-state output)
this pin outputs scanned serial data.
if the data is not correctly scanned, this pin goes into a high-impedance state as defined by ieee1149.1.
(5) jtrst# (input)
input a low level to this pin to reset the debug module. this invalidates the debug functions.
  • low level: initializes the debug module and invalidates the debug functions.
  • high level: clears resetting of the debug module and validates the debug functions.

remark because this signal is not provided in vr5500 ver. 1.x, the function of this signal is implemented by the coldreset# signal.

(6) ntrcdata(3:0) (output)
these pins output a trace packet that is generated as a result of an operation of the processor. it takes one or more cycles to output data of one packet.

(7) ntrcend (output)
this signal is asserted when the last data of a trace packet is output to ntrcdata(3:0).

(8) ntrcclk (output)
this pin outputs a clock of the same frequency as sysclock. this clock can be used when a reference clock is necessary for processing trace information, etc.

(9) rmode#/bktgio# (input/output)
this pin functions as the rmode# signal while the jtrst# signal (coldreset# signal with ver. 1.x) is active, and as the bktgio# signal at other times.

(a) rmode# (input)
input a signal that sets a debug reset to this pin. this signal is sampled when the jtrst# signal (coldreset# signal with ver. 1.x) is deasserted. setting of a debug reset by the rmode# signal is reflected in a debug register.
  • low level: executes a debug reset to the processor. the contents of the reset implemented by asserting the rmode# signal are the same as those implemented by asserting the reset# signal. the reset bit of the debug register is set to 1.
  • high level: does not execute a debug reset to the processor.

(b) bktgio# (input/output)
input a signal that requests generation of a debug break to this pin when it is set in the input mode. when this pin is set in the output mode, it outputs a signal that indicates occurrence of a debug trigger or the debug mode status of the processor.
this pin is set in the input mode by default, but the mode can be changed later by setting of debug register. (i) in input mode input a low level to this pin for the duration of only one cycle to generate a debug break. the processor then enters the debug mode when possible. if the processor is already in the debug mode or if a request for occurrence of a debug break has already been made, inputting a low level to this pin is meaningless. ? low level: generates a debug break and places the processor in the debug mode. ? high level: leaves the processor in the normal mode.
(ii) in output mode
the vr5500 can report detection of a trigger event every 2 sysclock cycles at the fastest. all the trigger events that occur after a trigger was output by the previous bktgio# signal are combined into one and output. a trigger event that is not reported when the processor enters the debug mode will not be reported later.
  • low level: indicates that a trigger event is detected inside the processor if the number of cycles is 1. if the number of cycles is 2, this signal indicates that the processor is in the debug mode.
  • high level: indicates that the processor is in the normal mode.
because the internal circuitry of the vr5500 has a superscalar structure and operates at a frequency higher than that of the system interface, a trigger event may occur much earlier than the bktgio# signal reports its occurrence.

21.3 boundary scan

the boundary scan register, one of the jtag registers, is a 125-bit shift register and holds the status of all the pins of the vr5500.
the least significant bit (jsysaden) of this register is the jtag output enable bit. when this bit is set to 1, jtag output is enabled for all the outputs of the processor.

figure 21-2. boundary scan register (diagram: bit layout of the register, with jsysaden as bit 0)

the boundary scan register is scanned starting from the least significant bit. the sequence of scanning the register bits is shown below.
table 21-2. boundary scan sequence

  no.  signal name            no.  signal name   no.  signal name   no.  signal name            no.  signal name
  1    jsysaden               26   sysad11       51   sysad55       76   sysid0                 101  syscmd7
  2    drvcon                 27   sysad43       52   sysad24       77   sysid1                 102  syscmd8
  3    rfu (always input 0.)  28   sysad12       53   sysad56       78   sysid2                 103  tintsel
  4    sysad0                 29   sysad44       54   sysad25       79   rfu (always input 0.)  104  int0#
  5    sysad32                30   sysad13       55   sysad57       80   busmode                105  int1#
  6    sysad1                 31   sysad45       56   sysad26       81   validout#              106  int2#
  7    sysad33                32   sysad14       57   sysad58       82   validin#               107  int3#
  8    sysad2                 33   sysad46       58   sysad27       83   rdrdy#                 108  int4#
  9    sysad34                34   sysad15       59   sysad59       84   wrrdy#                 109  int5#
  10   sysad3                 35   sysad47       60   sysad28       85   extrqst#               110  bktgio#
  11   sysad35                36   sysad16       61   sysad60       86   preq#                  111  rfu (always input 1.)
  12   sysad4                 37   sysad48       62   sysad29       87   release#               112  nmi#
  13   sysad36                38   sysad17       63   sysad61       88   reset#                 113  rfu (always input 1.)
  14   sysad5                 39   sysad49       64   sysad30       89   coldreset#             114  bigendian
  15   sysad37                40   sysad18       65   sysad62       90   rfu (always input 0.)  115  divmode0
  16   sysad6                 41   sysad50       66   sysad31       91   o3return#              116  divmode1
  17   sysad38                42   sysad19       67   sysad63       92   dwbtrans#              117  divmode2
  18   sysad7                 43   sysad51       68   sysadc0       93   disdvalido#            118  rfu (always input 1.)
  19   sysad39                44   sysad20       69   sysadc4       94   syscmd0                119  ntrcclk
  20   sysad8                 45   sysad52       70   sysadc1       95   syscmd1                120  ntrcdata0
  21   sysad40                46   sysad21       71   sysadc5       96   syscmd2                121  ntrcdata1
  22   sysad9                 47   sysad53       72   sysadc2       97   syscmd3                122  ntrcdata2
  23   sysad41                48   sysad22       73   sysadc6       98   syscmd4                123  ntrcdata3
  24   sysad10                49   sysad54       74   sysadc3       99   syscmd5                124  ntrcend
  25   sysad42                50   sysad23       75   sysadc7       100  syscmd6                125  rfu (always input 1.)

remark # indicates active low.
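In table 21-2 the sysad pins are interleaved: entries 4 to 67 alternate between the lower half (sysad0 to sysad31) and the upper half (sysad32 to sysad63) of the bus. A tool that builds scan vectors could compute the position instead of looking it up. The Python sketch below captures that pattern; the function name is hypothetical, and the formula is derived only from the table entries, not from any statement in the manual.

```python
def sysad_scan_position(n):
    """Return the table 21-2 sequence number (1-based) of pin sysad<n>."""
    if not 0 <= n <= 63:
        raise ValueError("sysad index must be in 0..63")
    if n < 32:
        return 4 + 2 * n            # sysad0 is no. 4, sysad1 is no. 6, ...
    return 5 + 2 * (n - 32)         # sysad32 is no. 5, sysad33 is no. 7, ...
```

The other signals in the table follow no such regular pattern and would still need an explicit lookup.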
21.4 connecting debugging tool

to use the debug functions of the vr5500, a circuit for connecting an external debugging tool is necessary on the target board. this section explains the circuit connection when the kyoto microcomputer in-circuit emulator partner-et ii is used as the debugging tool.

caution when evaluating connection of an in-circuit emulator with a trace clock of 100 mhz or more, consult nec before designing the board. the frequency of the trace clock (ntrcclk) of the vr5500 is the same as that of sysclock.

21.4.1 connecting in-circuit emulator and target board

use of the following kell's connectors is recommended when using the partner-et ii.
  • 8830e-026-170s (26-pin, straight-angle type)
  • 8830e-026-170l (26-pin, right-angle type)

the pins of these recommended connectors are laid out as follows.

figure 21-3. ie connection connector pin layout
(a) signal layout on connector side of target board (diagram: two rows of pins, a1 to a13 and b1 to b13, with a1 and b1 on the index mark side)
remark the dotted line indicates the approximate outline of the connector.
(b) connector appearance (diagram: index mark and a1 pin position)
allocate functions to the pins of the recommended connectors as follows when using the partner-et ii.

table 21-3. ie connector pin functions

  pin no.  signal name  i/o on ie connection side  function
  a1       trcclk       o                          trace clock output
  a2       trcdata0     o                          trace data 0 output
  a3       trcdata1     o                          trace data 1 output
  a4       trcdata2     o                          trace data 2 output
  a5       trcdata3     o                          trace data 3 output
  a6       trcend       o                          trace data end output
  a7       ddi          i                          data input for debug serial interface
  a8       dck          i                          clock input for debug serial interface
  a9       dms          i                          transfer mode select input for debug serial interface
  a10      ddo          o                          data output for debug serial interface
  a11      drst(−)      i                          debug control unit reset input (active low)
  a12      port0        o                          general-purpose control signal output 0 (3-state output)
  a13      port1        o                          general-purpose control signal output 1 (3-state output)
  b1       gnd          -                          ground potential
  b2       gnd          -                          ground potential
  b3       gnd          -                          ground potential
  b4       gnd          -                          ground potential
  b5       gnd          -                          ground potential
  b6       gnd          -                          ground potential
  b7       gnd          -                          ground potential
  b8       gnd          -                          ground potential
  b9       gnd          -                          ground potential
  b10      gnd          -                          ground potential
  b11      reserved     -                          leave this pin open.
  b12      reserved     -                          leave this pin open.
  b13      vdd          -                          3.3 v (for monitoring target power application)
21.4.2 connection circuit example

the figure below shows an example of the connection circuit when the kell's connector 8830e-026-170s is used.

figure 21-4. debugging tool connection circuit example (when trace function is used)
[circuit diagram not reproduced. it connects the vr5500 pins ntrcclk, ntrcdata0 to ntrcdata3, ntrcend, jtdi, jtdo, jtck, jtms, jtrst#, and rmode#/bktgio# to the trcclk, trcdata0 to trcdata3, trcend, ddi, ddo, dck, dms, drst(-), port0, port1, gnd, and vdd pins of the 8830e-026-170s connector, using 22 Ω, 4.7 kΩ, and 50 kΩ resistors and a 3.3 V supply. an external event detector, logic analyzer, etc. also appears in the diagram. notes 1 to 5 below are keyed to individual signal lines.]

notes
1. keep the clock pattern length as short as possible, and shield the pattern by enclosing it with gnd. keep the pattern length to within 100 mm.
2. keep the pattern length as short as possible, and in any case within 100 mm.
3. use a 3.3 V buffer.
4. use a clock buffer.
5. when using the bktgio function as a debug interrupt input from an external event detector, use port1 as a three-state control signal for the detection output signal of the external event detector (control the detection output signal so that it goes into a high-impedance state when port1 is high).

caution: directly connect the jtdo pin only to the in-circuit emulator. if the jtdo pin is connected as the boundary scan of the next stage, the system may hang up.

remark: vdd of the connector (b13) is used only to detect power application to the target board. however, it may be used as the power source for a signal driver, such as dck, depending on the tool used. directly connect it to the power supply of the target board.
appendix a sub-block order

a block of data elements (byte, halfword, word, or doubleword) can be extracted from memory by two methods: sequential ordering and sub-block ordering. this appendix explains these methods, with emphasis on sub-block ordering.

the minimum data element of a block transfer of the vr5500 differs depending on the bus width of the system interface. in the 64-bit bus mode, doubleword is the minimum unit. in the 32-bit bus mode, word is the minimum unit. in this appendix, the minimum data element is indicated as d.

(1) sequential ordering

with sequential ordering, the data elements of a block are extracted serially, i.e., sequentially. figure a-1 illustrates the sequential order. in this example, d0 is extracted first, and d3 last.

figure a-1. extracting data blocks in sequential order
d0: extracted first
d1: extracted second
d2: extracted third
d3: extracted fourth
(2) sub-block ordering

with sub-block ordering, the sequence in which the data elements are extracted can be defined. figure a-2 shows the sequence in which a data block consisting of four elements is extracted. in this example, d2 is extracted first.

figure a-2. extracting data in sub-block order
sequence of extraction: d2, d3, d0, d1
d0: extracted third
d1: extracted fourth
d2: extracted first
d3: extracted second

the sub-block ordering circuit generates the address of each element by xoring each bit of the start block address with the output of a binary counter that starts from 00₂ and is incremented each time a data element has been extracted.

tables a-1 to a-3 show sub-block ordering in which data is extracted from a block of four elements using this method, where the start block address is 10₂, 11₂, and 01₂, respectively. to generate sub-block ordering, the address of a sub-block (10, 11, or 01) is xored with the binary count (00₂ to 11₂) of a doubleword. for example, to identify the element that is extracted third from a data block with a start address of 10₂, xor address 10₂ with binary count 10₂. the result is 00₂, i.e., d0.
table a-1. transfer sequence by sub-block ordering: where start address is 10₂

cycle  start block address  binary count  extracted element
1      10                   00            10
2      10                   01            11
3      10                   10            00
4      10                   11            01

table a-2. transfer sequence by sub-block ordering: where start address is 11₂

cycle  start block address  binary count  extracted element
1      11                   00            11
2      11                   01            10
3      11                   10            01
4      11                   11            00

table a-3. transfer sequence by sub-block ordering: where start address is 01₂

cycle  start block address  binary count  extracted element
1      01                   00            01
2      01                   01            00
3      01                   10            11
4      01                   11            10
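the xor rule used to build tables a-1 to a-3 can be sketched in a few lines of python. this is an illustration only, not nec code: the function name sub_block_order and its block_size parameter are assumptions introduced for the example, and the element index stands for d0 to d3.

```python
def sub_block_order(start, block_size=4):
    """Return the element indices of a block in sub-block order.

    start      -- index of the element requested first (the start block address)
    block_size -- number of elements in the block; assumed to be a power of two
                  so that the XOR result always stays inside the block
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    if not 0 <= start < block_size:
        raise ValueError("start must lie inside the block")
    # XOR the start address with a binary counter that runs 0, 1, 2, ...
    return [start ^ count for count in range(block_size)]


if __name__ == "__main__":
    for start in (0b10, 0b11, 0b01):
        order = sub_block_order(start)
        # Print each element as a 2-bit binary number, as in the tables.
        print(f"start {start:02b}:", " ".join(f"{d:02b}" for d in order))
```

with a start address of 0 the same function yields the sequential order d0, d1, d2, d3, because xoring with zero leaves the counter value unchanged.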
appendix b recommended power supply circuit

figure b-1 shows an example of the connection of a power supply circuit. this figure is for reference only. for mass production, thoroughly evaluate and select each element (such as capacitors and regulators).

figure b-1. example of recommended power supply circuit connection
[circuit diagram not reproduced. labels recoverable from the diagram: vcc (5 V) and gnd; an lt1085ct-3.3 regulator; an lt1085cm/ct regulator with 100 Ω and 20 Ω resistors; the vr5500 supply pins vddio and vdd (1.5 V); vlv (3.3 V); capacitor groups of 0.1 µF × 10 to 20 and 0.1 µF × 20; and 0.1 µF and 100 µF/25 V capacitors around each regulator.]
appendix c restrictions on vr5500

this appendix explains the restrictions on the vr5500 and the action to be taken for each.

c.1 restrictions on ver. 1.x

c.1.1 during normal operation

(1) return address in case of address error exception
with vr5500 ver. 1.x, when the return address (contents of the epc register) to which execution returns from an exception handler via the eret instruction is in the address error area, a value different from the contents of the program counter (pc + 0x04 or pc + 0x08) is stored in the epc register if an interrupt occurs immediately after execution of the eret instruction. therefore, detect the address error and stop program execution in the exception handler.
this restriction does not apply to ver. 2.0 or later.

(2) uncached accelerated store operation
with vr5500 ver. 1.x, a store operation to the uncached accelerated area is not performed correctly if the system interface is in the 32-bit mode.
this restriction does not apply to ver. 2.0 or later.

(3) instruction fetch in uncached area
with vr5500 ver. 1.x, when an instruction is fetched from the uncached area while the system interface is in the 64-bit bus mode, the subsequent instruction may not be correctly executed, depending on the combination of the instructions at an even address (lower word) and an odd address (higher word) (mainly combinations of jump/branch instructions).

remark: the vr5500 fetches instructions from the uncached area in word (32-bit) units. therefore, of the data output to the 64-bit bus, the instruction in the lower word (even address) is fetched, and the instruction in the higher word (odd address) is not used.

therefore, make sure that the instruction at the odd address is not a jump/branch instruction, or that the code of the instruction at the even address is identical to that at the odd address.
this restriction does not apply to ver. 2.0 or later.
the combinations in which the instruction is not correctly executed are shown below.

(a) branch instruction at even address with condition satisfied and branch instruction at odd address
the branch destination of the branch instruction at the even address is calculated by using the offset (lower 16 bits) of the branch instruction at the odd address. there is no problem if the offset of the branch instruction at the odd address is the same as that at the even address.

(b) j or jal instruction at even address and branch instruction at odd address
the jump destination of the jump instruction at the even address is calculated by using the code (lower 26 bits) of the branch instruction at the odd address.
(c) j or jal instruction at even address and j or jal instruction at odd address
the jump destination of the jump instruction at the even address is calculated by using the code (lower 26 bits) of the jump instruction at the odd address. there is no problem if the code (lower 26 bits) of the jump instruction at the odd address is the same as that at the even address.

(d) branch instruction at even address with condition satisfied and j or jal instruction at odd address
the branch destination of the branch instruction at the even address is calculated by using the code (lower 16 bits) of the jump instruction at the odd address.

(4) operation in low-power mode
vr5500 ver. 1.x does not stop the internal pipeline clock even when the wait instruction is executed (the power consumption is not reduced).
this restriction does not apply to ver. 2.0 or later.

(5) clock output on clearing reset
with vr5500 ver. 1.x, the clock for the serial interface may not be output if a multiplication rate of 2, 3.5, 4, 4.5, or 5.5 is selected when generating an internal clock from an external clock. therefore, select a multiplication rate of 2.5, 3, or 5.
this restriction does not apply to ver. 2.1 or later.

c.1.2 when debug function is used

caution: the operation or result produced by the restrictions described below differs depending on the external debugging tool connected. for details when using the debug function of the vr5500, therefore, consult the manufacturer of the debugging tool to be used.

(1) trace data when jr/jalr instruction is executed
with vr5500 ver. 1.x, when two or more jr or jalr instructions are executed within 16 pclocks, the contents of the internal tpc packet change before the tpc packet that indicates the jump destination address of the first jump instruction is output. consequently, the wrong contents are output as the first tpc packet.
this restriction does not apply to ver. 2.0 or later.

(2) trace data when branch instruction is executed
with vr5500 ver. 1.x, if a branch instruction whose branch condition is satisfied and a branch instruction whose branch condition is not satisfied are executed consecutively, the nseq packet that is output indicates that a branch has been satisfied two times.
this restriction does not apply to ver. 2.0 or later.

(3) trace data when exception occurs
with vr5500 ver. 1.x, a tpc packet or nseq packet is output instead of an exp packet, which indicates occurrence of an exception, if an exception occurs as a result of executing the instruction in the branch delay slot.
this restriction does not apply to ver. 2.0 or later.
(4) trace data when exl bit = 1
with vr5500 ver. 1.x, a packet indicating occurrence of a tlb exception is output instead of a packet indicating occurrence of an ordinary exception if a tlb exception occurs while the exl bit is set to 1.
this restriction does not apply to ver. 2.0 or later.

(5) operation of bktgio# signal
with vr5500 ver. 1.x, when a match of an instruction address is specified as a break trigger, an event trigger is output from the bktgio# pin if an instruction cache miss conflicts with the instruction address match.
this restriction does not apply to ver. 2.0 or later.

(6) operation when instruction address break occurs
with vr5500 ver. 1.x, when an instruction address match is specified as a break trigger, the processor deadlocks if an interrupt or exception conflicts with the instruction address match.
this restriction does not apply to ver. 2.0 or later.

(7) setting of mask register for read access
with vr5500 ver. 1.x, a break does not occur if the mask register is set taking endian into consideration when a data data trap for read access is set. when setting a data data trap for read access, therefore, set the mask register without taking endian into consideration.
this restriction does not apply to ver. 2.0 or later.

(8) debug reset signal
vr5500 ver. 1.x does not have a dedicated signal to execute a debug reset and uses the coldreset# signal instead. however, the coldreset# signal may be asserted during boundary scan, and therefore an error may occur during boundary scan.
this restriction does not apply to ver. 2.0 or later because a dedicated jtrst# signal has been added.

(9) trace output in debug mode
vr5500 ver. 1.x outputs trace data even in the debug mode. therefore, ignore the data output from the ntrcdata(3:0) pins from when a debug exception packet is output to when a dret packet is output.
in practice, this restriction has no influence on the data output from the ntrcdata(3:0) pins in the debug mode, because that data is ignored by the in-circuit emulator.
this restriction does not apply to ver. 2.0 or later.
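restriction (3) in c.1.1 (instruction fetch in uncached area) lends itself to an automated check of a code image. the sketch below applies the workaround stated there, flagging every odd-address word that is a jump/branch instruction and differs from the even-address word of the same doubleword. it is a hedged illustration, not an nec tool: the function names are hypothetical, the image is assumed to be a list of 32-bit instruction words starting at a doubleword-aligned address, and only j, jal, and the conditional branch encodings of the standard mips instruction set are classified (jr and jalr are not, because the listed combinations mention only j, jal, and branch instructions).

```python
# Major opcodes (bits 31..26) of the standard MIPS encoding.
J, JAL = 0x02, 0x03
REGIMM, COP1 = 0x01, 0x11
# beq, bne, blez, bgtz and their "likely" variants.
BRANCH_OPCODES = {0x04, 0x05, 0x06, 0x07, 0x14, 0x15, 0x16, 0x17}
# rt field values of the REGIMM branches (bltz, bgez, ..., bgezall).
REGIMM_BRANCH_RT = {0x00, 0x01, 0x02, 0x03, 0x10, 0x11, 0x12, 0x13}


def is_jump_or_branch(word):
    """Return True if the 32-bit word encodes j, jal, or a conditional branch."""
    opcode = (word >> 26) & 0x3F
    if opcode in (J, JAL) or opcode in BRANCH_OPCODES:
        return True
    if opcode == REGIMM:
        return ((word >> 16) & 0x1F) in REGIMM_BRANCH_RT
    if opcode == COP1:
        # rs field 0x08 selects the floating-point branch group (bc1f, bc1t, ...).
        return ((word >> 21) & 0x1F) == 0x08
    return False


def find_hazard_words(words, base_address=0):
    """Return addresses of odd-address words that violate the workaround.

    words        -- instruction words in address order (hypothetical input format)
    base_address -- address of words[0]; must be doubleword (8-byte) aligned
    """
    if base_address % 8:
        raise ValueError("base_address must be doubleword aligned")
    hazards = []
    for i in range(0, len(words) - 1, 2):
        even, odd = words[i], words[i + 1]
        # Safe cases per the manual: odd word is not a jump/branch,
        # or its code is identical to the even word.
        if is_jump_or_branch(odd) and odd != even:
            hazards.append(base_address + 4 * (i + 1))
    return hazards
```

the check is deliberately conservative: it follows the stated workaround rather than the four failing combinations (a) to (d), so it may flag pairs that would in fact execute correctly. it is relevant only to code fetched from the uncached area by ver. 1.x devices in the 64-bit bus mode.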
c.2 restrictions on ver. 2.0

c.2.1 during normal operation

(1) clock output on clearing reset
with vr5500 ver. 2.0, the clock for the serial interface may not be output if a multiplication rate of 2, 3.5, 4, 4.5, or 5.5 is selected when generating an internal clock from an external clock. therefore, select a multiplication rate of 2.5, 3, or 5.
this restriction does not apply to ver. 2.1 or later.

(2) operation of release# signal in out-of-order return mode
when vr5500 ver. 2.0 is set in the pipeline mode in the out-of-order return mode, the release# signal is not asserted (low level), and the right to control the system interface is not released to the external agent, even if the rdrdy# signal is deasserted (high level) in the cycle in which the first of successive read requests is issued.
this restriction does not apply to ver. 2.1 or later.

(3) return address in case of address error exception
with vr5500 ver. 2.0, if a jump/branch instruction is located two instructions before the boundary with the address error space, and if no branch prediction miss (including ras miss), eret instruction commitment, or exception (other than the address error exception in question) occurs (is committed) between execution of that jump/branch instruction and occurrence (commitment) of an address error exception due to one of the causes listed below, the address stored in the badvaddr register by the processing of that address error exception is the address two instructions after the jump/branch instruction (the boundary with the address error space). however, the correct address is stored in the epc register.
therefore, do not locate a jump/branch instruction at the position two instructions before the boundary with the address error space.

this restriction applies to the following causes of the address error exception.

• if an attempt is made to fetch an instruction in the kernel address space in the user or supervisor mode
• if an attempt is made to fetch an instruction in the supervisor address space in the user mode
• if an attempt is made to fetch an instruction not located at the word boundary
• if an attempt is made to reference the address error space in the kernel mode
c.2.2 when using debug function

caution: the operation or result produced by the restrictions described below differs depending on the external debugging tool connected. for details when using the debug function of the vr5500, therefore, consult the manufacturer of the debugging tool to be used.

(1) initialization of debug registers
vr5500 ver. 2.0 initializes the monitor data register in the debug module when the reset# signal is asserted. however, because the reset# signal is masked on the emulator side, this restriction has no influence.

(2) operation when break trigger and exception conflict
with vr5500 ver. 2.0, if the data break control register in the debug module is set so that only a trigger occurs, and if a break trigger and an address error exception or tlb exception occur in the same load/store instruction, the address error exception or tlb exception is indicated by the cause code. however, 0xbfc0 1000 for the debug exception is used as the exception vector address.

(3) masking nmi request
with vr5500 ver. 2.0, when an nmi request is already held pending internally, an nmi exception occurs even if occurrence of nmi is masked by the debug mode control register in the debug module.
c.3 restrictions on ver. 2.1 or later

c.3.1 during normal operation

(1) return address in case of address error exception
with vr5500 ver. 2.1 or later, if a jump/branch instruction is located two instructions before the boundary with the address error space, and if no branch prediction miss (including ras miss), eret instruction commitment, or exception (other than the address error exception in question) occurs (is committed) between execution of that jump/branch instruction and occurrence (commitment) of an address error exception due to one of the causes listed below, the address stored in the badvaddr register by the processing of that address error exception is the address two instructions after the jump/branch instruction (the boundary with the address error space). however, the correct address is stored in the epc register.
therefore, do not locate a jump/branch instruction at the position two instructions before the boundary with the address error space.

this restriction applies to the following causes of the address error exception.

• if an attempt is made to fetch an instruction in the kernel address space in the user or supervisor mode
• if an attempt is made to fetch an instruction in the supervisor address space in the user mode
• if an attempt is made to fetch an instruction not located at the word boundary
• if an attempt is made to reference the address error space in the kernel mode

c.3.2 when using debug function

caution: the operation or result produced by the restrictions described below differs depending on the external debugging tool connected. for details when using the debug function of the vr5500, therefore, consult the manufacturer of the debugging tool to be used.

(1) initialization of debug registers
vr5500 ver. 2.1 or later initializes the monitor data register in the debug module when the reset# signal is asserted. however, because the reset# signal is masked on the emulator side, this restriction has no influence.

(2) operation when break trigger and exception conflict
with vr5500 ver. 2.1 or later, if the data break control register in the debug module is set so that only a trigger occurs, and if a break trigger and an address error exception or tlb exception occur in the same load/store instruction, the address error exception or tlb exception is indicated by the cause code. however, 0xbfc0 1000 for the debug exception is used as the exception vector address.

(3) masking nmi request
with vr5500 ver. 2.1 or later, when an nmi request is already held pending internally, an nmi exception occurs even if occurrence of nmi is masked by the debug mode control register in the debug module.
