IBM PowerNP™ NP4GS3 Network Processor
Preliminary
May 18, 2001
0.1 Copyright and Disclaimer

© Copyright International Business Machines Corporation 1999, 2001. All rights reserved.

US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Printed in the United States of America, May 2001.

The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: IBM, the IBM logo, PowerPC, and PowerNP. Other company, product, and service names may be trademarks or service marks of others.

All information contained in this document is subject to change without notice. The products described in this document are not intended for use in implantation or other life support applications where malfunction may result in injury or death to persons. The information contained in this document does not affect or change IBM product specifications or warranties. Nothing in this document shall operate as an express or implied license or indemnity under the intellectual property rights of IBM or third parties. All information contained in this document was obtained in specific environments, and is presented as an illustration. The results obtained in other operating environments may vary.

While the information contained herein is believed to be accurate, such information is preliminary and should not be relied upon for accuracy or completeness, and no representations or warranties of accuracy or completeness are made. The information contained in this document is provided on an "as is" basis. In no event will IBM be liable for damages arising directly or indirectly from any use of the information contained in this document.

IBM Microelectronics Division
1580 Route 52, Bldg. 504
Hopewell Junction, NY 12533-6351

The IBM home page can be found at http://www.ibm.com
The IBM Microelectronics Division home page can be found at http://www.ibm.com/chips

Note: This document contains information on products in the sampling and/or initial production phases of development. This information is subject to change without notice. Verify with your IBM field applications engineer that you have the latest version of this document before finalizing a design.
Contents

About This Book
  Who Should Read This Manual
  Related Publications
  Conventions Used in This Manual

1. General Information
1.1 Features
1.2 Ordering Information
1.3 Overview
1.4 NP4GS3-Based Systems
1.5 Structure
1.5.1 EPC Structure
1.5.1.1 Coprocessors
1.5.1.2 Enhanced Threads
1.5.1.3 Hardware Accelerators
1.5.2 NP4GS3 Memory
1.6 Data Flow
1.6.1 Basic Data Flow
1.6.2 Data Flow in the EPC

2. Physical Description
2.1 Pin Information
2.1.1 Packet Routing Switch Interface Pins
2.1.2 Flow Control Interface Pins
2.1.3 ZBT Interface Pins
2.1.4 DDR DRAM Interface Pins
2.1.4.1 D3, D2, D1, and D0 Interface Pins
2.1.4.2 D4_0 and D4_1 Interface Pins
2.1.4.3 D6_x Interface Pins
2.1.4.4 DS1 and DS0 Interface Pins
2.1.5 PMM Interface Pins
2.1.5.1 TBI Bus Pins
2.1.5.2 GMII Bus Pins
2.1.5.3 SMII Bus Pins
2.1.5.4 POS Bus Pins
2.1.7 Management Bus Interface Pins
2.1.8 Miscellaneous Pins
2.1.9 PLL Filter Circuit
2.1.10 Thermal I/O Usage
2.1.10.1 Temperature Calculation
2.1.10.2 Measurement Calibration
2.2 Clocking Domains
2.3 Mechanical Specifications
2.4 Signal Pin Lists
2.5 IEEE 1149 (JTAG) Compliance
2.5.1 Statement of JTAG Compliance
2.5.2 JTAG Compliance Mode
2.5.3 JTAG Implementation Specifics
2.5.4 Brief Overview of JTAG Instructions

3. Physical MAC Multiplexer
3.1 Ethernet Overview
3.1.1 Ethernet Interface Timing Diagrams
3.1.2 Ethernet Counters
3.1.3 Ethernet Support
3.2 POS Overview
3.2.1 POS Timing Diagrams
3.2.2 POS Counters
3.2.3 POS Support

4. Ingress Enqueuer / Dequeuer / Scheduler
4.1 Overview
4.2 Operation
4.2.1 Operational Details
4.3 Ingress Flow Control
4.3.1 Flow Control Hardware Facilities
4.3.2 Hardware Function
4.3.2.1 Exponentially Weighted Moving Average (EWMA)
4.3.2.2 Flow Control Hardware Actions

5. Switch Interface
5.1 Ingress Switch Data Mover
5.1.1 Cell Header
5.1.2 Frame Header
5.2 Ingress Switch Cell Interface
5.2.1 Idle Cell Format
5.2.1.1 CRC Bytes: Word 15
5.2.1.2 I-SCI Transmit Header for the Idle Cell
5.2.2 Switch Data Cell Format - Ingress and Egress
5.3 Data-Aligned Synchronous Link Interface
5.4 Egress Switch Cell Interface
5.4.1 Master and Multicast Grant Reporting
5.4.2 Output Queue Grant Reporting
5.4.2.1 OQG Reporting in External Wrap Mode
5.4.3 Switch Fabric to Network Processor Egress Idle Cell
5.4.4 Receive Header Formats for Sync Cells
5.5 Egress Switch Data Mover

6. Egress Enqueuer / Dequeuer / Scheduler
6.2 Operation
6.3 Egress Flow Control
6.3.1 Flow Control Hardware Facilities
6.3.2 Remote Egress Status Bus
6.3.2.1 Bus Sequence and Timing
6.3.2.2 Configuration
6.3.3 Hardware Function
6.3.3.1 Exponentially Weighted Moving Average
6.3.3.2 Flow Control Hardware Actions
6.4 The Egress Scheduler
6.4.1 Egress Scheduler Components
6.4.1.1 Scheduling Calendars
6.4.1.2 Flow Queues
6.4.1.3 Target Port Queues
6.4.2 Configuring Flow Queues
6.4.2.1 Additional Configuration Notes

7. Embedded Processor Complex
7.1 Overview
7.1.1 Thread Types
7.2 Dyadic Protocol Processor Unit (DPPU)
7.2.1 Core Language Processor (CLP)
7.2.1.1 Core Language Processor Address Map
7.2.2 CLP Opcode Formats
7.2.3 DPPU Coprocessors
7.2.4 Shared Memory Pool
7.3 CLP Opcode Formats
7.3.1 Control Opcodes
7.3.1.1 NOP Opcode
7.3.1.2 Exit Opcode
7.3.1.3 Branch and Link Opcode
7.3.1.4 Return Opcode
7.3.1.5 Branch Register Opcode
7.3.1.6 Branch PC Relative Opcode
7.3.1.7 Branch Reg+Off Opcode
7.3.2 Data Movement Opcodes
7.3.2.1 Memory Indirect Opcode
7.3.2.2 Memory Address Indirect Opcode
7.3.2.3 Memory Direct Opcode
7.3.2.4 Scalar Access Opcode
7.3.2.5 Scalar Immediate Opcode
7.3.2.6 Transfer Quadword Opcode
7.3.2.7 Zero Array Opcode (NP4GS3B (R2.0) only)
7.3.3 Coprocessor Execution Opcodes
7.3.3.1 Execute Direct Opcode
7.3.3.2 Execute Indirect Opcode
7.3.3.3 Execute Direct Conditional Opcode
7.3.3.4 Execute Indirect Conditional Opcode
7.3.3.5 Wait Opcode
7.3.3.6 Wait and Branch Opcode
7.3.4 ALU Opcodes
7.3.4.1 Arithmetic Immediate Opcode
7.3.4.2 Logical Immediate Opcode
7.3.4.3 Compare Immediate Opcode
7.3.4.4 Load Immediate Opcode
7.3.4.5 Arithmetic Register Opcode
7.3.4.6 Count Leading Zeros Opcode
7.4 DPPU Coprocessors
7.4.1 Tree Search Engine Coprocessor
7.4.2 Data Store Coprocessor
7.4.2.1 Data Store Coprocessor Address Map
7.4.2.2 Data Store Coprocessor Commands
7.4.3 Control Access Bus (CAB) Coprocessor
7.4.3.1 CAB Coprocessor Address Map
7.4.3.2 CAB Access to NP4GS3 Structures
7.4.3.3 CAB Coprocessor Commands
7.4.4 Enqueue Coprocessor
7.4.4.1 Enqueue Coprocessor Address Map
7.4.4.2 Enqueue Coprocessor Commands
7.4.5 Checksum Coprocessor
7.4.5.1 Checksum Coprocessor Address Map
7.4.5.2 Checksum Coprocessor Commands
7.4.6 String Copy Coprocessor
7.4.6.1 String Copy Coprocessor Address Map
7.4.6.2 String Copy Coprocessor Commands
7.4.7 Policy Coprocessor
7.4.7.1 Policy Coprocessor Address Map
7.4.7.2 Policy Coprocessor Commands
7.4.8 Counter Coprocessor
7.4.8.1 Counter Coprocessor Address Map
7.4.8.2 Counter Coprocessor Commands
7.4.9 Semaphore Coprocessor
7.4.9.1 Semaphore Coprocessor Commands
7.4.9.2 Error Conditions
7.4.9.3 Software Use Models
7.5 Interrupts and Timers
7.5.1 Interrupts
7.5.1.1 Interrupt Vector Registers
7.5.1.2 Interrupt Mask Registers
7.5.1.3 Interrupt Target Registers
7.5.1.4 Software Interrupt Registers
7.5.2 Timers
7.5.2.1 Timer Interrupt Counters
7.6 Dispatch Unit
7.6.1 Port Configuration Memory
7.6.1.1 Port Configuration Memory Index Definition
7.6.2 Port Configuration Memory Contents Definition
7.7 Hardware Classifier
7.7.1 Ingress Classification
7.7.1.1 Ingress Classification Input
7.7.1.2 Ingress Classification Output
7.7.2 Egress Classification
7.7.2.1 Egress Classification Input
7.7.2.2 Egress Classification Output
7.8 Policy Manager
7.9 Counter Manager
7.9.1 Counter Manager Usage
7.10 Semaphore Manager

8. Tree Search Engine
8.1 Overview
8.1.1 Addressing Control Store (CS)
8.1.2 D6 Control Store
8.1.3 Logical Memory Views of D6
8.1.4 Control Store Use Restrictions
8.1.5 Object Shapes
8.1.6 Illegal Memory Access
8.1.7 Memory Range Checking (Address Bounds Check)
8.2 Trees and Tree Searches
8.2.1 Input Key and Color Register for FM and LPM Trees
8.2.2 Input Key and Color Register for SMT Trees
8.2.3 Direct Table
8.2.3.1 Pattern Search Control Blocks (PSCB)
8.2.3.2 Leaves and Compare-at-End Operation
8.2.3.3 Cascade/Cache
8.2.3.4 Cache Flag and NrPSCBs Registers
8.2.3.5 Cache Management
8.2.3.6 Search Output
8.2.4 Tree Search Algorithms
8.2.4.1 FM Trees
8.2.4.2 LPM Trees
8.2.4.3 SMT Trees
8.2.4.4 Compare-at-End Operation
8.2.4.5 Ropes
8.2.4.6 Aging
8.2.5 Tree Configuration and Initialization
8.2.5.1 The LuDefTable
8.2.5.2 TSE Free Lists (TSE_FL)
8.2.6 TSE Registers and Register Map
8.2.7 TSE Instructions
8.2.7.1 FM Tree Search (TS_FM)
8.2.7.2 LPM Tree Search (TS_LPM)
8.2.7.3 SMT Tree Search (TS_SMT)
8.2.7.4 Memory Read (MRD)
8.2.7.5 Memory Write (MWR)
8.2.7.6 Hash Key (HK)
8.2.7.7 Read LuDefTable (RDLUDEF)
8.2.7.8 Compare-at-End (COMPEND)
8.2.7.9 DistinguishPosition for Fast Table Update (DISTPOS_GDH)
8.2.7.10 Read PSCB for Fast Table Update (RDPSCB_GDH)
8.2.7.11 Write PSCB for Fast Table Update (WRPSCB_GDH)
8.2.7.12 SETPATBIT_GDH
8.2.8 GTH Hardware Assist Instructions
8.2.8.1 Hash Key GTH (HK_GTH)
8.2.8.2 Read LuDefTable GTH (RDLUDEF GTH)
8.2.8.3 Tree Search Enqueue Free List (TSENQFL)
8.2.8.4 Tree Search Dequeue Free List (TSDQFL)
8.2.8.5 Read Current Leaf from Rope (RCLR)
8.2.8.6 Advance Rope with Optional Delete Leaf (ARDL)
8.2.8.7 Tree Leaf Insert Rope (TLIR)
8.2.8.8 Clear PSCB (CLRPSCB)
8.2.8.9 Read PSCB (RDPSCB)
8.2.8.10 Write PSCB (WRPSCB)
8.2.8.11 Push PSCB (PUSHPSCB)
8.2.8.12 Distinguish (DISTPOS)
8.2.8.13 TSR0 Pattern (TSR0PAT)
8.2.8.14 Pattern 2DTA (PAT2DTA)
8.2.9 Hash Functions

9. Serial / Parallel Manager Interface
9.1 SPM Interface Components
9.2 SPM Interface Data Flow
9.3 SPM Interface Protocol
9.4 SPM CAB Address Space
9.4.2 Word Access Space
9.4.3.1 EEPROM Single-Byte Access
9.4.3.2 EEPROM 2-Byte Access
9.4.3.3 EEPROM 3-Byte Access
9.4.3.4 EEPROM 4-Byte Access

10. Embedded PowerPC™ Subsystem
10.1 Description
10.2 Processor Local Bus and Device Control Register Buses
10.3 Universal Interrupt Controller (UIC)
10.4 PCI/PLB Macro
10.5 PLB Address Map
10.6 CAB Address Map
10.7 CAB Interface Macro
10.7.1 PowerPC CAB Address (pwrpc_cab_addr) Register
10.7.2 PowerPC CAB Data (pwrpc_cab_data) Register
10.7.3 PowerPC CAB Control (pwrpc_cab_cntl) Register
10.7.4 PowerPC CAB Status (pwrpc_cab_status) Register
10.7.5 PowerPC CAB Mask (pwrpc_cab_mask) Register [NP4GS3B (R2.0) only]
10.7.6 PowerPC CAB Write Under Mask Data [NP4GS3B (R2.0) only]
10.7.7 PCI Host CAB Address (host_cab_addr) Register
10.7.8 PCI Host CAB Data (host_cab_data) Register
10.7.9 PCI Host CAB Control (host_cab_cntl) Register
10.7.10 PCI Host CAB Status (host_cab_status) Register
10.7.11 PCI Host CAB Mask (host_cab_mask) Register (NP4GS3B (R2.0) only)
10.7.12 PCI Host CAB Write Under Mask Data Register (NP4GS3B (R2.0) only)
10.8 Mailbox Communications and DRAM Interface Macro
10.8.1 Mailbox Communications between PCI Host and PowerPC Subsystem
10.8.2 PCI Interrupt Status (pci_interr_status) Register
10.8.3 PCI Interrupt Enable (pci_interr_ena) Register
10.8.4 PowerPC Subsystem to PCI Host Message Resource (p2h_msg_resource) Register
10.8.5 PowerPC Subsystem to Host Message Address (p2h_msg_addr) Register
10.8.6 PowerPC Subsystem to Host Doorbell (p2h_doorbell) Register
10.8.7 Host to PowerPC Subsystem Message Address (h2p_msg_addr) Register
10.8.8 Host to PowerPC Subsystem Doorbell (h2p_doorbell) Register
10.8.9 Mailbox Communications between PowerPC Subsystem and EPC
10.8.10 EPC to PowerPC Subsystem Resource (e2p_msg_resource) Register
10.8.11 EPC to PowerPC Subsystem Message Address (e2p_msg_addr) Register
10.8.12 EPC to PowerPC Subsystem Doorbell (e2p_doorbell) Register
10.8.13 EPC Interrupt Vector Register
10.8.14 EPC Interrupt Mask Register
10.8.15 PowerPC Subsystem to EPC Message Address (p2e_msg_addr) Register
10.8.16 PowerPC Subsystem to EPC Doorbell (p2e_doorbell) Register
10.8.17 Mailbox Communications between PCI Host and EPC
10.8.18 EPC to PCI Host Resource (e2h_msg_resource) Register
10.8.19 EPC to PCI Host Message Address (e2h_msg_addr) Register
10.8.20 EPC to PCI Host Doorbell (e2h_doorbell) Register
10.8.21 PCI Host to EPC Message Address (h2e_msg_addr) Register
10.8.22 PCI Host to EPC Doorbell (h2e_doorbell) Register
10.8.23 Message Status (msg_status) Register
10.8.24 PowerPC Boot Redirection Instruction Registers (boot_redir_inst)
10.8.25 PowerPC Machine Check (pwrpc_mach_chk) Register
10.8.26 Parity Error Status and Reporting
10.8.27 Slave Error Address Register (SEAR)
10.8.28 Slave Error Status Register (SESR)
10.8.29 Parity Error Counter (perr_count) Register
10.9 System Start-Up and Initialization
10.9.1 NP4GS3 Resets
10.9.2 Systems Initialized by External PCI Host Processors
10.9.3 Systems with PCI Host Processors and Initialized by PowerPC Subsystem
10.9.4 Systems without PCI Host Processors and Initialized by PowerPC Subsystem
10.9.5 Systems without PCI Host or Delayed PCI Configuration and Initialized by EPC

11. Reset and Initialization
11.1 Overview
11.2 Step 1: Set I/Os
11.3 Step 2: Reset the NP4GS3
11.4 Step 3: Boot
11.4.1 Boot the Embedded Processor Complex (EPC)
11.4.2 Boot the PowerPC
11.4.3 Boot Summary
11.5 Step 4: Setup 1
11.6 Step 5: Diagnostics 1
11.7 Step 6: Setup 2
11.8 Step 7: Hardware Initialization
11.9 Step 8: Diagnostics 2
11.10 Step 9: Operational
11.11 Step 10: Configure
11.12 Step 11: Initialization Complete

12. Debug Facilities
12.1 Debugging Picoprocessors
12.1.1 Single Step
12.1.2 Break Points
12.1.3 CAB Accessible Registers
12.2 RISCWatch

13. Configuration
13.1 Memory Configuration
13.1.1 Memory Configuration Register (memory_config)
13.1.2 DRAM Parameter Register (dram_parm)
13.2 Master Grant Mode Register (mg_mode)
13.3 TB Mode Register (tb_mode)
13.4 Egress Reassembly Sequence Check Register (e_reassembly_seq_ck)
13.5 Aborted Frame Reassembly Action Control Register (AFRAC)
13.6 Packing Registers
13.6.1 Packing Control Register (pack_ctrl)
13.6.2 Packing Delay Register (pack_dly) (NP4GS3B (R2.0))
13.7 Initialization Control Registers
13.7.1 Initialization Register (init)
13.7.2 Initialization Done Register (init_done)
13.8 NP4GS3 Ready Register (npr_ready)
13.9 Phase Locked Loop Registers
13.9.1 Phase Locked Loop Fail Register (pll_lock_fail)
13.10 Software Controlled Reset Register (soft_reset)
13.11 Ingress Free Queue Threshold Configuration
13.11.1 BCB_FQ Threshold Registers
13.11.2 BCB_FQ Threshold for Guided Traffic (bcb_fq_th_gt)
13.11.3 BCB_FQ_Threshold_0 / _1 / _2 Registers (bcb_fq_th_0/_1/_2)
13.12 Ingress Target DMU Data Storage Map Register (i_tdmu_dsu)
13.13 Embedded Processor Complex Configuration
13.13.1 PowerPC Core Reset Register (powerpc_reset)
13.13.2 PowerPC Boot Redirection Instruction Registers (boot_redir_inst)
13.13.3 Watch Dog Reset Enable Register (wd_reset_ena)
13.13.4 Boot Override Register (boot_override)
13.13.5 Thread Enable Register (thread_enable)
13.13.6 GFH Data Disable Register (gfh_data_dis)
13.13.7 Ingress Maximum DCB Entries (i_max_dcb)
13.13.8 Egress Maximum DCB Entries (e_max_dcb)
13.13.9 My Target Blade Address Register (my_tb)
13.13.10 Local Target Blade Vector Register (local_tb_vector)
13.13.11 Local MC Target Blade Vector Register (local_mc_tb_max)
13.13.12 Ordered Semaphore Enable Register (ordered_sem_ena) (NP4GS3B (R2.0))
13.14 Flow Control Structures
13.14.1 Ingress Flow Control Hardware Structures
13.14.1.1 Ingress Transmit Probability Memory Register (i_tx_prob_mem)
13.14.1.2 Ingress Pseudo-Random Number Register (i_rand_num)
13.14.1.3 Free Queue Thresholds Register (fq_th)
13.14.2 Egress Flow Control Structures
13.14.2.1 Egress Transmit Probability Memory (e_tx_prob_mem) Register
13.14.2.2 Egress Pseudo-Random Number (e_rand_num)
13.14.2.3 P0 Twin Count Threshold (p0_twin_th)
13.14.2.4 P1 Twin Count Threshold (p1_twin_th)
13.14.2.5 Egress P0 Twin Count EWMA Threshold Register (e_p0_twin_ewma_th)
13.14.2.6 Egress P1 Twin Count EWMA Threshold Register (e_p1_twin_ewma_th)
13.14.3 Exponentially Weighted Moving Average Constant (K) Register (ewma_k)
13.14.4 Exponentially Weighted Moving Average Sample Period (T) Register (ewma_t)
13.14.5 Remote Egress Status Bus Configuration Enables (res_data_cnf)
13.15 Target Port Data Storage Map (tp_ds_map) Register
13.16 Egress SDM Stack Threshold Register (e_sdm_stack_th)
13.17 Free Queue Extended Stack Maximum Size (fq_es_max) Register
13.18 Egress Free Queue Thresholds
13.18.1 FQ_ES_Threshold_0 Register (fq_es_th_0)
13.18.2 FQ_ES_Threshold_1 Register (fq_es_th_1)
13.18.3 FQ_ES_Threshold_2 Register (fq_es_th_2)
13.19 Discard Flow QCB Register (discard_qcb)
13.20 Bandwidth Allocation Register (bw_alloc_reg, NP4GS3B (R2.0))
13.21 Frame Control Block FQ Size Register (fcb_fq_max)
13.22 Data Mover Unit (DMU) Configuration
13.23 QD Accuracy Register (qd_acc)
13.24 Packet over SONET Control Register (pos_ctrl)
13.25 Packet over SONET Maximum Frame Size (pos_max_fs)
13.26 Ethernet Encapsulation Type Register for Control (e_type_c)
13.27 Ethernet Encapsulation Type Register for Data (e_type_d)
13.28 Source Address Array (sa_array)
13.29 DASL Initialization and Configuration
13.29.1 DASL Configuration Register (dasl_config)
13.29.1.1 Dynamic Switch Interface Selection
13.29.2 DASL Bypass and Wrap Register (dasl_bypass_wrap)
13.29.3 DASL Start Register (dasl_start)
13.30 Programmable I/O Register (pio_reg) (NP4GS3B (R2.0))

14. Electrical and Thermal Specifications
14.1 Driver Specifications
14.2 Receiver Specifications
14.3 Other Driver and Receiver Specifications
14.3.1 DASL Specifications

15. Glossary of Terms and Abbreviations
ibm powernp np4gs3 preliminary network processor np3_dllof.fm.08 may 18, 2001 page 11 of 554 list of figures figure 1-1. function placement in an np4gs3-based system ................................................................... 28 figure 1-2. np4gs3 major functional blocks .......................................................................................... .... 30 figure 1-3. data flow overview ..................................................................................................... .............. 33 figure 1-4. basic data flow in the epc ................................................................................................ ....... 34 figure 2-1. device interfaces ..................................................................................................... .................. 37 figure 2-2. zbt sram timing diagram ................................................................................................. ...... 43 figure 2-3. ddr control timing diagram .............................................................................................. ...... 45 figure 2-4. ddr read timing diagram ................................................................................................. ...... 46 figure 2-5. ddr write output timing diagram .......................................................................................... .. 47 figure 2-6. np4gs3 dmu bus clock connections ...................................................................................... 56 figure 2-7. np4gs3 dmu bus clock connections (pos overview) ........................................................... 57 figure 2-8. tbi timing diagram ..................................................................................................... .............. 59 figure 2-9. gmii timing diagram .................................................................................................... ............. 62 figure 2-10. smii timing diagram ................................................................................................... ............ 64 figure 2-11. pos transmit timing diagram ............................................................................................ .... 69 figure 2-12. pos receive timing diagram ............................................................................................. .... 70 figure 2-13. pci timing diagram .................................................................................................... ............. 74 figure 2-14. spm bus timing diagram ................................................................................................. ....... 76 figure 2-15. pll filter circuit diagram ............................................................................................. ........... 80 figure 2-16. thermal monitor ...................................................................................................... ................. 80 figure 2-17. clock generation and distribution ...................................................................................... ..... 82 figure 2-18. pins diagram ......................................................................................................... ................... 83 figure 2-19. mechanical diagram ................................................................................................... ............. 84 figure 3-1. 
pmm overview .......................................................................................................... ............... 109 figure 3-2. ethernet mode ......................................................................................................... ................. 110 figure 3-3. smii timing diagram .................................................................................................... ........... 111 figure 3-4. gmii timing diagram .................................................................................................... ........... 111 figure 3-5. tbi timing diagram ..................................................................................................... ............ 112 figure 3-6. gmii pos mode timing diagram ............................................................................................ 1 13 figure 3-7. oc-3c / oc-12 / oc-12c configuration ................................................................................... 120 figure 3-8. oc-48 configuration ................................................................................................... ............ 121 figure 3-9. oc-48c configuration .................................................................................................. ............ 121 figure 3-10. receive pos8 interface timing for 8-bit data bus (oc-3c) .................................................. 122 figure 3-11. transmit pos8 interface timing for 8-bit data bus (oc-3c) ................................................. 123 figure 3-12. receive pos8 interface timing for 8-bit data bus (oc-12c) ................................................ 124 figure 3-13. transmit pos8 interface timing for 8-bit data bus (oc-12c) ............................................... 125 figure 3-14. receive pos32 interface timing for 32-bit data bus (oc-48c) ............................................ 126 figure 3-15. transmit pos32 interface timing for 32-bit data bus (oc-48c) ........................................... 127
ibm powernp np4gs3 network processor preliminary page 12 of 554 np3_dllof.fm.08 may 18, 2001 figure 4-1. logical organization of ingress eds data flow management ................................................134 figure 4-2. sof ring structure ..................................................................................................... .............136 figure 4-3. ingress eds logical structure ........................................................................................... ......137 figure 5-1. switch interface functional blocks ...................................................................................... .....142 figure 5-2. cell header format ..................................................................................................... .............144 figure 5-3. frame header format .................................................................................................... ..........147 figure 5-4. crc calculation example ................................................................................................ ........150 figure 5-5. external wrap mode (two np4gs3 interconnected) ...............................................................155 figure 5-6. external wrap mode (single np4gs3 configuration) ...............................................................156 figure 6-1. egress eds functional blocks ............................................................................................ .....162 figure 6-2. cell formats and storage in the egress ds .............................................................................165 figure 6-3. tpq, fcb, and egress frame example ..................................................................................166 figure 6-4. res bus timing ......................................................................................................... ..............170 figure 6-5. hub-based res bus configuration to support more than two np4gs3 s .................................171 figure 6-6. the egress scheduler ................................................................................................... ...........174 figure 7-1. embedded processor complex block diagram .......................................................................184 figure 7-2. dyadic protocol processor unit functional blocks (np4gs3a (r1.1)) ....................................187 figure 7-3. dyadic protocol processor unit functional blocks (np4gs3b (r2.0)) ....................................188 figure 7-4. core language processor ................................................................................................ ........190 figure 7-5. ot5 field definition: loading halfword/word gprs from a halfword/word array ....................199 figure 7-6. ot5 field definition: loading gpr byte from array byte .........................................................200 figure 7-7. ot5 field definition: loading gpr halfword/word from array byte .........................................201 figure 7-8. ot5 field definition: store gpr byte/halfword/word to array byte/halfword/word .................202 figure 7-9. ot3i field definition .................................................................................................. .................215 figure 7-10. ot2i field definition: compare halfword/word immediate ......................................................217 figure 7-11. 
ot4i field definition load immediate halfword/word .............................................................218 figure 7-12. ot4i field definition: load immediate byte .............................................................................21 9 figure 7-13. ot3r field definition ................................................................................................. ................221 figure 7-14. a frame in the ingress data store ......................................................................................... 227 figure 7-15. frame in the egress data store ........................................................................................... .228 figure 7-16. ingress fcbpage format ................................................................................................ .......244 figure 7-17. egress fcbpage format ................................................................................................. ......246 figure 7-18. dispatch unit ........................................................................................................ ..................271 figure 7-19. split between picocode and hardware for the policy manager ..............................................281 figure 7-20. counter manager block diagram .......................................................................................... .285 figure 7-21. counter definition entry .............................................................................................. ...........288 figure 7-22. counter blocks and sets ................................................................................................ ........289 figure 8-1. example shaping dimensions ............................................................................................. .....300 figure 8-2. effects of using a direct table ........................................................................................... ......305
ibm powernp np4gs3 preliminary network processor np3_dllof.fm.08 may 18, 2001 page 13 of 554 figure 8-3. example input key and leaf pattern fields ............................................................................. 309 figure 8-4. rope structure ........................................................................................................ ................. 311 figure 8-5. general layout of tse fields in shared memory pool ........................................................... 321 figure 8-6. general layout of tse rdludef in shared memory pool .................................................... 328 figure 8-7. shared memory pool with distpos_gdh command subfields ............................................ 331 figure 8-8. shared memory pool with pscb subfields .............................................................................. 332 figure 8-9. no-hash function ...................................................................................................... .............. 345 figure 8-10. 192-bit ip hash function ............................................................................................... ........ 346 figure 8-11. mac hash function ..................................................................................................... .......... 347 figure 8-12. network dispatcher hash function ....................................................................................... .348 figure 8-13. 48-bit mac hash function ............................................................................................... ...... 349 figure 8-14. 60-bit mac hash function ............................................................................................... ...... 350 figure 8-15. 8-bit hash function ................................................................................................... ............. 351 figure 8-16. 12-bit hash function .................................................................................................. ............ 352 figure 8-17. 16 bit hash function ................................................................................................... ........... 353 figure 9-1. spm interface block diagram ............................................................................................. ..... 355 figure 9-2. epc boot image in external eeprom .................................................................................... 357 figure 9-3. spm bit timing ......................................................................................................... ............... 358 figure 9-4. spm interface write protocol ............................................................................................ ....... 358 figure 9-5. spm interface read protocol ............................................................................................. ...... 359 figure 10-1. powerpc subsystem block diagram ..................................................................................... 365 figure 10-2. polled access flow diagram ............................................................................................. .... 375 figure 11-1. system environments .................................................................................................. .......... 422 figure 13-1. np4gs3 memory subsystems .............................................................................................. 437 figure 14-1. 
3.3 v lvttl / 5 v tolerant bp33 and ip33 receiver input current/voltage curve .............. 534
ibm powernp np4gs3 preliminary network processor np3_dllot.fm.08 may 18, 2001 page 15 of 554 list of tables table 2-1. signal pin functions .................................................................................................... ................ 38 table 2-2. ibm 28.4 gbps packet routing switch interface pins ................................................................ 39 table 2-3. flow control pins ....................................................................................................... ................. 41 table 2-4. z0 zbt sram interface pins ................................................................................................ ...... 42 table 2-5. z1 zbt sram interface pins ................................................................................................ ...... 42 table 2-6. zbt sram timing diagram legend (for figure 2-2 ) .................................................................. 44 table 2-7. ddr timing diagram legend (for figure 2-3 , figure 2-4 ,and figure 2-5 ) ................................ 48 table 2-8. ddr timing diagram legend (for figure 2-3 , figure 2-4 ,and figure 2-5 ) ................................ 49 table 2-9. d3, d2, and d1 interface pins .............................................................................................. ....... 49 table 2-10. d0 memory pins ......................................................................................................... ............... 51 table 2-11. d4_0 and d4_1 interface pins ............................................................................................. ..... 51 table 2-12. d6_5, d6_4, d6_3, d6_2, d6_1, and d6_0 memory pins ........................................................ 52 table 2-13. ds1 and ds0 interface pins ............................................................................................... ...... 53 table 2-14. pmm interface pins ..................................................................................................... .............. 54 table 2-15. pmm interface pin multiplexing .......................................................................................... ....... 55 table 2-16. parallel data bit to 8b/10b position mapping (tbi interface) ................................................... 57 table 2-17. pmm interface pins: tbi mode ............................................................................................. .... 58 table 2-18. tbi timing diagram legend (for figure 2-8 ) ............................................................................ 60 table 2-19. pmm interface pins: gmii mode ............................................................................................ ... 61 table 2-20. gmii timing diagram legend (for figure 2-9 ) .......................................................................... 63 table 2-21. pmm interface pins: smii mode ............................................................................................ .... 63 table 2-22. smii timing diagram legend (for figure 2-10 ) ......................................................................... 65 table 2-23. pmm interface pins pos32 mode ............................................................................................ 66 table 2-24. pos signals ........................................................................................................... ................... 67 table 2-25. 
pos timing diagram legend (for figure 2-11 and figure 2-12 ) .............................................. 71 table 2-26. pci pins .............................................................................................................. ...................... 72 table 2-27. pci timing diagram legend (for figure 2-13 ) .......................................................................... 75 table 2-28. management bus pins .................................................................................................... .......... 75 table 2-29. spm bus timing diagram legend (for figure 2-14 ) ................................................................. 76 table 2-30. miscellaneous pins .................................................................................................... ................ 77 table 2-31. signals requiring pull-up or pull-down ................................................................................... .79 table 2-32. mechanical specifications ............................................................................................. ............ 85 table 2-33. complete signal pin listing by signal name ............................................................................ 86 table 2-34. complete signal pin listing by grid position ............................................................................ 96 table 2-35. jtag compliance-enable inputs .......................................................................................... .. 106 table 2-36. implemented jtag public instructions ................................................................................... 1 06 table 3-1. ingress ethernet counters ............................................................................................... ......... 114 table 3-2. egress ethernet counters ................................................................................................ ......... 116
ibm powernp np4gs3 network processor preliminary page 16 of 554 np3_dllot.fm.08 may 18, 2001 table 3-3. ethernet support ....................................................................................................... .................118 table 3-4. dmu and framer configurations ............................................................................................ ...120 table 3-5. receive counter ram addresses for ingress pos mac ........................................................128 table 3-6. transmit counter ram addresses for egress pos mac ........................................................129 table 3-7. pos support ............................................................................................................ ..................131 table 4-1. flow control hardware facilities ......................................................................................... ......139 table 5-1. cell header fields ...................................................................................................... ...............145 table 5-2. frame header fields ..................................................................................................... ............148 table 5-3. idle cell format transmitted to the switch interface .................................................................149 table 5-4. switch data cell format .................................................................................................. ..........151 table 5-5. receive cell header byte h0 for an idle cell ............................................................................157 table 5-6. idle cell format received from the switch interface - 16-blade mode .....................................157 table 5-7. idle cell format received from the switch interface - 64-blade mode .....................................158 table 6-1. flow control hardware facilities ......................................................................................... ......168 table 6-2. flow queue parameters ................................................................................................... .........173 table 6-3. valid combinations of scheduler parameters ...........................................................................173 table 6-4. configure a flow qcb ..................................................................................................... ..........178 table 7-1. core language processor address map ...................................................................................190 table 7-2. shared memory pool ...................................................................................................... ...........193 table 7-3. condition codes (cond field) ............................................................................................. ......194 table 7-4. aluop field definition .................................................................................................. ..............214 table 7-5. lop field definition .................................................................................................... ................216 table 7-6. arithmetic opcode functions ............................................................................................. ........220 table 7-7. coprocessor instruction format .......................................................................................... ......224 table 7-8. 
data store coprocessor address map ......................................................................................22 5 table 7-9. ingress datapool byte address definitions ...............................................................................22 6 table 7-10. egress frames datapool quadword addresses .....................................................................228 table 7-11. datapool byte addressing with cell header skip ...................................................................229 table 7-12. number of frame-bytes in the datapool .................................................................................230 table 7-13. wreds input ........................................................................................................... ...............232 table 7-14. wreds output .......................................................................................................... ..............232 table 7-15. rdeds input ........................................................................................................... ................233 table 7-16. rdeds output .......................................................................................................... ..............233 table 7-17. wrids input ........................................................................................................... .................234 table 7-18. wrids output .......................................................................................................... ...............234 table 7-19. rdids input ........................................................................................................... ..................234 table 7-20. rdids output .......................................................................................................... ................235 table 7-21. rdmoree input ......................................................................................................... ............235 table 7-22. rdmoree output ........................................................................................................ ...........236
ibm powernp np4gs3 preliminary network processor np3_dllot.fm.08 may 18, 2001 page 17 of 554 table 7-23. rdmorei input ......................................................................................................... ............. 236 table 7-24. rdmorei output ........................................................................................................ ............ 237 table 7-25. leasetwin output ...................................................................................................... .......... 237 table 7-26. edirty inputs ......................................................................................................... ............... 238 table 7-27. edirty output ......................................................................................................... .............. 238 table 7-28. idirty inputs ......................................................................................................... ................. 239 table 7-29. idirty output ......................................................................................................... ................ 239 table 7-30. cab coprocessor address map ............................................................................................. 239 table 7-31. cab address field definitions ........................................................................................... ..... 240 table 7-32. cab address, functional island encoding .............................................................................. 240 table 7-33. cabarb input .......................................................................................................... ............... 241 table 7-34. cabaccess input ....................................................................................................... .......... 242 table 7-35. cabaccess output ...................................................................................................... ........ 242 table 7-36. enqueue coprocessor address map ...................................................................................... 243 table 7-37. ingress fcbpage description ............................................................................................ ..... 244 table 7-38. egress fcbpage description ............................................................................................. ..... 247 table 7-39. enqe target queues ..................................................................................................... ........ 251 table 7-40. egress target queue selection coding .................................................................................. 251 table 7-41. egress target queue parameters .......................................................................................... 252 table 7-42. type field for discard queue ............................................................................................. .... 252 table 7-43. enqe command input ..................................................................................................... ....... 252 table 7-44. egress queue class definitions .......................................................................................... .... 253 table 7-45. enqi target queues ..................................................................................................... .......... 253 table 7-46. 
ingress target queue selection coding ................................................................................. 254 table 7-47. ingress target queue fcbpage parameters ......................................................................... 254 table 7-48. enqi command input ..................................................................................................... ........ 254 table 7-49. ingress-queue class definition ......................................................................................... ...... 255 table 7-50. enqclr command input ................................................................................................... .... 255 table 7-51. enqclr output ......................................................................................................... ............. 255 table 7-52. release_label output .................................................................................................. ..... 256 table 7-53. checksum coprocessor address map .................................................................................... 256 table 7-54. gengen/gengenx command inputs ................................................................................. 258 table 7-55. gengen/gengenx/genip/genipx command outputs .................................................... 258 table 7-56. genip/genipx command inputs .......................................................................................... 25 9 table 7-57. chkgen/chkgenx command inputs .................................................................................. 259 table 7-58. chkgen/chkgenx/chkip/chkipx command outputs ..................................................... 260 table 7-59. chkip/chkipx command inputs ........................................................................................... 2 60 table 7-60. string copy coprocessor address map .................................................................................. 261 table 7-61. strcopy command input .................................................................................................. ....... 261
ibm powernp np4gs3 network processor preliminary page 18 of 554 np3_dllot.fm.08 may 18, 2001 table 7-62. strcopy command output ................................................................................................. .....262 table 7-63. policy coprocessor address map .......................................................................................... ..262 table 7-64. polaccess input ....................................................................................................... ................263 table 7-65. polaccess output ...................................................................................................... ..............263 table 7-66. counter coprocessor address map ........................................................................................2 63 table 7-67. ctrinc input .......................................................................................................... ....................264 table 7-68. ctradd input .......................................................................................................... ...................265 table 7-69. ctrrd/ctrrdclr input .................................................................................................. .............265 table 7-70. ctrrd/ctrrdclr output ................................................................................................. ............265 table 7-71. ctrwr15_0/ctrwr31_16 input ............................................................................................ ......266 table 7-72. semaphore lock input ................................................................................................... ..........267 table 7-73. semaphore unlock input ................................................................................................. ........267 table 7-74. reservation release input .............................................................................................. ........267 table 7-75. priority assignments for the dispatch unit queue arbiter .......................................................272 table 7-76. port configuration memory index ......................................................................................... ...273 table 7-77. relationship between sp field, queue, and port configuration memory index .....................274 table 7-78. port configuration memory content ....................................................................................... .274 table 7-79. protocol identifiers .................................................................................................. .................276 table 7-80. hccia table ........................................................................................................... .................277 table 7-81. protocol identifiers for frame encapsulation types ................................................................277 table 7-82. general purpose register bit definitions for ingress classification flags ..............................278 table 7-83. flow control information values ......................................................................................... .....279 table 7-84. hccia index definition ................................................................................................. ...........280 table 7-85. general purpose register 1 bit definitions for egress classification flags ............................280 table 7-86. 
polcb field definitions ................................................................................................ ............282 table 7-87. counter manager components ............................................................................................. ..286 table 7-88. counter types ......................................................................................................... ................286 table 7-89. counter actions ....................................................................................................... ................286 table 7-90. counter definition entry format ......................................................................................... .....287 table 7-91. counter manager passed parameters ....................................................................................290 table 7-92. counter manager use of address bits .....................................................................................29 0 table 8-1. control store address mapping for tse references ................................................................295 table 8-2. cs address map and use .................................................................................................... .....295 table 8-3. plb and d6 control store addressing ......................................................................................29 7 table 8-4. dtentry, pscb, and leaf shaping ........................................................................................... .298 table 8-5. height, width, and offset restrictions for tse objects ............................................................301 table 8-6. fm and lpm tree fixed leaf formats ......................................................................................303 table 8-7. smt tree fixed leaf formats ............................................................................................... ....303 table 8-8. search input parameters ................................................................................................. ..........304
ibm powernp np4gs3 preliminary network processor np3_dllot.fm.08 may 18, 2001 page 19 of 554 table 8-9. cache status registers .................................................................................................. ........... 306 table 8-10. search output parameters ............................................................................................... ....... 307 table 8-11. dtentry and pscbline formats ............................................................................................ 308 table 8-12. lpm dtentry and pscbline formats .................................................................................... 308 table 8-13. nlasmt field format .................................................................................................... ......... 309 table 8-14. compdeftable entry format .............................................................................................. .... 310 table 8-15. ludeftable rope parameters ............................................................................................. ... 311 table 8-16. nlarope field format ................................................................................................... ......... 312 table 8-17. ludeftable entry definitions ........................................................................................... ....... 312 table 8-18. free list entry definition .............................................................................................. ........... 315 table 8-19. tse scalar registers for gth only ........................................................................................ 3 15 table 8-20. tse array registers for all gxh ........................................................................................... .. 317 table 8-21. tse registers for gth (tree management) ........................................................................... 317 table 8-22. tse scalar registers for gdh and gth ................................................................................ 317 table 8-23. pscb register format ................................................................................................... ......... 318 table 8-24. tse gth indirect registers .............................................................................................. ...... 318 table 8-25. address map for pscb0-2 registers in gth ......................................................................... 319 table 8-26. general tse instructions ............................................................................................... ......... 320 table 8-27. fm tree search input operands ............................................................................................ .321 table 8-28. fm tree search results (tsr) output ................................................................................... 322 table 8-29. lpm tree search input operands .......................................................................................... 3 22 table 8-30. lpm tree search results output ........................................................................................... 323 table 8-31. smt tree search input operands .......................................................................................... 3 24 table 8-32. smt tree search results output ........................................................................................... 324 table 8-33. 
memory read input operands .............................................................................................. .. 325 table 8-34. memory read output results .............................................................................................. ... 326 table 8-35. memory write input operands ............................................................................................. ... 326 table 8-36. hash key input operands ................................................................................................. ...... 327 table 8-37. hash key output results ................................................................................................. ....... 327 table 8-38. rdludef input operands ................................................................................................. ..... 328 table 8-39. rdludef output results ................................................................................................. ...... 329 table 8-40. compend input operands ................................................................................................. ... 329 table 8-41. compend output results ................................................................................................. .... 330 table 8-42. distpos_gdh input operands ............................................................................................. 331 table 8-43. distpos_gdh output results ............................................................................................. .332 table 8-44. rdpscb_gdh input operands .............................................................................................. 333 table 8-45. rdpscb_gdh output results .............................................................................................. .333 table 8-46. wrpscb_gdh input operands ............................................................................................. 3 34 table 8-47. wrpscb_gdh output results .............................................................................................. 334
ibm powernp np4gs3 network processor preliminary page 20 of 554 np3_dllot.fm.08 may 18, 2001 table 8-48. setpatbit_gdh input operands ........................................................................................... ...335 table 8-49. setpatbit_gdh output results ........................................................................................... ....335 table 8-50. general gth instructions ............................................................................................... .........336 table 8-51. hash key gth input operands .............................................................................................. .336 table 8-52. hash key gth output results .............................................................................................. ..337 table 8-53. rdludef_gth input operands ............................................................................................3 37 table 8-54. rdludef_gth output results ............................................................................................. 337 table 8-55. tsenqfl input operands ................................................................................................. .....338 table 8-56. tsenqfl output results ................................................................................................. ......338 table 8-57. tsdqfl input operands .................................................................................................. .......338 table 8-58. tsdqfl output results .................................................................................................. ........339 table 8-59. rclr input operands .................................................................................................... .........339 table 8-60. rclr output results .................................................................................................... ..........339 table 8-61. ardl input operands .................................................................................................... .........340 table 8-62. ardl output results .................................................................................................... ..........340 table 8-63. tlir input operands .................................................................................................... ...........341 table 8-64. tlir output results .................................................................................................... ............341 table 8-65. clrpscb input operands ................................................................................................. .....341 table 8-66. clrpscb output results ................................................................................................. ......341 table 8-67. rdpscb input operands .................................................................................................. ......342 table 8-68. rdpscb output results .................................................................................................. .......342 table 8-69. wrpscb input operands .................................................................................................. .....342 table 8-70. pushpscb input operands ................................................................................................ ...343 table 8-71. pushpscb output results ................................................................................................ ....343 table 8-72. 
distpos input operands ................................................................................................. ......343 table 8-73. distpos output results ................................................................................................. .......343 table 8-74. tsr0pat input operands ................................................................................................. ......344 table 8-75. tsr0pat output results ................................................................................................. .......344 table 8-76. pat2dta input operands ................................................................................................. ......344 table 8-77. pat2dta output results ................................................................................................. .......344 table 8-78. general hash functions ................................................................................................. .........345 table 9-1. field definitions for cab addresses ....................................................................................... ...361 table 10-1. plb master connections ................................................................................................. ........366 table 10-2. uic interrupt assignments .............................................................................................. .........367 table 10-3. np4gs3 pci device configuration header values ................................................................368 table 10-4. plb address map for pci/plb macro .....................................................................................368 table 10-5. plb address map ........................................................................................................ ............371 table 10-6. cab address map ........................................................................................................ ...........372 table 10-1. reset domains ......................................................................................................... ...............416
ibm powernp np4gs3 preliminary network processor np3_dllot.fm.08 may 18, 2001 page 21 of 554 table 11-1. reset and initialization sequence ....................................................................................... .... 421 table 11-2. set i/os checklist ..................................................................................................... ............... 423 table 11-3. setup 1 checklist ...................................................................................................... .............. 426 table 11-4. diagnostics 1 checklist ................................................................................................ ........... 427 table 11-5. setup 2 checklist ...................................................................................................... .............. 428 table 11-6. hardware initialization checklist ...................................................................................... ....... 429 table 11-7. diagnostic 2 checklist ................................................................................................. ............ 430 table 11-8. configure checklist ................................................................................................... .............. 432 table 14-1. absolute maximum ratings ............................................................................................... ...... 509 table 14-2. input capacitance (pf) ................................................................................................. ........... 509 table 14-3. operating supply voltages .............................................................................................. ........ 530 table 14-4. thermal characteristics ............................................................................................... ........... 530 table 14-5. definition of terms .................................................................................................... .............. 531 table 14-6. 1.8 v cmos driver dc voltage specifications ....................................................................... 531 table 14-7. 1.8 v cmos driver minimum dc currents at rated voltage ................................................. 531 table 14-8. 2.5 v cmos driver dc voltage specifications ....................................................................... 531 table 14-9. 2.5 v cmos driver minimum dc currents at rated voltage ................................................. 531 table 14-10. 3.3 v-tolerant 2.5 v cmos driver dc voltage specifications ............................................. 532 table 14-11. 3.3 v lvttl driver dc voltage specifications ..................................................................... 532 table 14-12. 3.3 v lvttl/5.0 v-tolerant driver dc voltage specifications ............................................. 532 table 14-13. 3.3 v lvttl driver minimum dc currents at rated voltage ............................................... 532 table 14-14. 1.8 v cmos receiver dc voltage specifications ................................................................ 533 table 14-15. 2.5 v cmos receiver dc voltage specifications ................................................................ 533 table 14-16. 3.3 v lvttl receiver dc voltage specifications ................................................................ 533 table 14-17. 3.3 v lvttl / 5 v tolerant receiver dc voltage specifications .......................................... 533 table 14-18. 
receiver maximum input leakage dc current input specifications .................................... 533 table 14-19. lvds receiver dc specifications ........................................................................................ 535 table 14-20. sstl2 dc specifications ............................................................................................... ....... 535 table 14-21. dasl receiver dc specifications idasl_a ......................................................................... 537 table 14-22. dasl driver dc specifications odasl_a ........................................................................... 537
about this book

this datasheet describes the ibm powernp np4gs3 and explains the basics of building a system using it. a terms and abbreviations list is provided in section 15. glossary of terms and abbreviations on page 539.

who should read this manual

this datasheet provides information for network hardware engineers and programmers using the np4gs3 to develop interconnect solutions for internet or enterprise network providers. it includes an overview of data flow through the device and descriptions of each functional block. in addition, it provides electrical, physical, thermal, and configuration information about the device.

related publications

ibm powerpc 405gp embedded processor user's manual (http://www-3.ibm.com/chips/techlib/techlib.nsf/products/powerpc_405gp_embedded_processor)
pci specification, version 2.2 (http://www.pcisig.com)

conventions used in this manual

the following conventions are used in this manual.

1. the bit notation in the following sections is non-ibm, meaning that bit zero is the least significant bit and bit 31 is the most significant bit in a 4-byte word:
   - section 2. physical description
   - section 3. physical mac multiplexer
   - section 4. ingress enqueuer / dequeuer / scheduler
   - section 5. switch interface
   - section 6. egress enqueuer / dequeuer / scheduler
   - section 7. embedded processor complex
   - section 9. serial / parallel manager interface
2. the bit notation in section 8. tree search engine and section 10. embedded powerpc subsystem is ibm-standard, meaning that bit 31 is the least significant bit and bit zero is the most significant bit in a 4-byte word. (a short illustration of the two conventions follows this list.)
3. nibble numbering is the same as byte numbering. the left-most nibble is most significant and starts at zero.
4. all counters wrap back to zero when they exceed their maximum values. exceptions to this rule are noted in the counter definitions.
5. overbars (txenb, for example) designate signals that are asserted low.
6. numeric notation is as follows:
   - hexadecimal values are preceded by x. for example: x'0b00'.
   - binary values in text are either spelled out (zero and one) or appear in quotation marks. for example: '10101'.
   - binary values in the default and description columns of the register sections are often isolated from text, as in this example:
     0: no action on read access
     1: auto-reset interrupt request register upon read access
7. field length conventions are as follows:
   - 1 byte = 8 bits
   - 1 word = 4 bytes
   - 1 double word (dw) = 2 words = 8 bytes
   - 1 quadword (qw) = 4 words = 16 bytes
8. for signal and field definitions, when a field is designated as 'reserved':
   - it must be sent as zero as an input into the np4gs3, either as a signal i/o or a value in a reserved field of a control block used as input to a picocode process.
   - it must not be checked or modified as an output from the np4gs3, either as a signal i/o or a value in a reserved field of a control block used as input to an external code process.
   - its use as a code point results in unpredictable behavior.
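the following sketch illustrates the two bit-numbering conventions, the x'...' hexadecimal notation, and the active-low signal convention in c. it is an illustration of the conventions only; the helper names (bit_nonibm, bit_ibm) and the txenb sample value are invented for this sketch and are not part of the np4gs3 picocode or register interface.

```c
#include <stdint.h>
#include <stdio.h>
#include <assert.h>

/* non-ibm notation (most sections of this book):
 * bit 0 is the least significant bit of a 32-bit word. */
static inline unsigned bit_nonibm(uint32_t word, unsigned bit)
{
    return (word >> bit) & 1u;
}

/* ibm-standard notation (tree search engine and embedded powerpc sections):
 * bit 0 is the most significant bit of a 32-bit word. */
static inline unsigned bit_ibm(uint32_t word, unsigned bit)
{
    return (word >> (31u - bit)) & 1u;
}

int main(void)
{
    /* x'0b00' in this book's hexadecimal notation is 0x0B00 in c. */
    uint32_t word = 0x0B00;

    /* the same physical bit is bit 11 in non-ibm notation
     * and bit 20 in ibm-standard notation (20 = 31 - 11). */
    assert(bit_nonibm(word, 11) == 1u);
    assert(bit_ibm(word, 20) == 1u);

    /* an overbarred signal such as txenb is asserted low:
     * a sampled value of 0 means the transmit enable is active.
     * (hypothetical sample value, for illustration only.) */
    unsigned txenb_sample = 0;
    printf("transmit enable asserted: %s\n", txenb_sample == 0 ? "yes" : "no");
    return 0;
}
```

keeping one helper per bit-numbering convention, as above, is a simple way to avoid off-by-one mistakes when moving between the register descriptions in section 8 or section 10 and the rest of the book.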
1. general information

1.1 features

- 4.5 million packets per second (mpps) layer 2 and layer 3 switching (a rough per-packet budget implied by this rate is sketched after this list).
- eight dyadic protocol processor units (dppus)
  - two picocode processors per dppu
  - nine shared coprocessors per dppu
  - four threads per dppu
  - zero context switching overhead between threads
- embedded powerpc processor and external 33/66 mhz 32-bit pci bus for enhanced design flexibility.
  - supports riscwatch through the jtag interface
- ten medium access controls (macs)
  - up to four gigabit ethernet or 40 fast ethernet ports, accessed through serial media-independent (smii), gigabit media-independent (gmii), and ten-bit (tbi) interfaces, that support industry standard physical layer devices
  - 36 ethernet statistics counters per mac
  - up to one million software-defined, hardware-assisted counters, enabling support of many standard management information bases (mibs) at wire speed
  - 16 oc-3c, 4 oc-12, 4 oc-12c, 1 oc-48, or 1 oc-48c integrated packet over sonet (pos) interfaces that support industry standard pos framers
  - hardware vlan support (detection, tag insertion and deletion)
- two data-aligned synchronous link (dasl) ports
  - attach the device to an ibm packet routing switch, another np4gs3, or to itself
  - rated 3.25 to 4 gigabits per second (gbps)
  - compliant with the eia/jedec jesd8-6 standard for differential hstl
- addressing capability of 64 target network processors, enabling the design of network systems with up to 1024 ports.
- advanced flow control mechanisms that tolerate high rates of temporary oversubscription without tcp collapse.
- fast lookups and powerful search engines based on geometric hash functions that yield lower collision rates than conventional bit-scrambling methods.
- hardware support for port mirroring (note 1). mirrored traffic can share bandwidth with user traffic or use a separate switch data path, eliminating the normal penalty for port mirroring.
- support for jumbo frames (9018 bytes without vlan, 9022 bytes with vlan).
- hardware-managed, software-configured bandwidth allocation control of 2047 concurrent communication flows.
- serial management interface to support physical layer devices, board, and box functions.
- ibm sa-27e, 0.18 µm technology.
- voltage ratings:
  - 1.8 v supply voltage
  - 2.5 v and 3.3 v compatibility with drivers and receivers
  - 1.25 v reference voltage for sstl drivers
  - 1.5 v compatibility for dasl interfaces
- 1088-pin bottom surface metallurgy - ceramic column grid array (bsm-ccga) package with 815 signal i/o.
- ieee 1149.1a jtag compliant.

note 1: oc-48c ports are not supported by port mirroring functions.
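to put the throughput and threading figures above in perspective, the following back-of-envelope sketch computes the mean packet inter-arrival time at 4.5 mpps and the per-thread processing budget that follows if all 32 threads (8 dppus x 4 threads) are available for forwarding. the numbers are derived only from the feature list; the assumption that every thread forwards packets independently is an illustration, not a datasheet specification.

```c
#include <stdio.h>

int main(void)
{
    const double packets_per_second = 4.5e6;      /* 4.5 mpps from the feature list */
    const int dppus = 8;
    const int threads_per_dppu = 4;
    const int threads = dppus * threads_per_dppu; /* 32 threads total */

    /* mean inter-arrival time of packets at the rated forwarding rate */
    double arrival_ns = 1e9 / packets_per_second;        /* about 222 ns */

    /* if the threads work on packets in parallel, each thread must retire
     * one packet roughly every (threads * inter-arrival time) nanoseconds */
    double per_thread_budget_ns = arrival_ns * threads;  /* about 7.1 us */

    printf("packet inter-arrival time: %.1f ns\n", arrival_ns);
    printf("per-thread budget (%d threads): %.0f ns (~%.1f us)\n",
           threads, per_thread_budget_ns, per_thread_budget_ns / 1000.0);
    return 0;
}
```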
ibm powernp np4gs3 network processor preliminary general information page 26 of 554 np3_dl_sec01_over.fm.08 may 18, 2001 1.2 ordering information part number / description: IBM32NPR161EPXCAD133 - ibm powernp np4gs3a (r1.1); IBM32NPR161EPXCAE133 - ibm powernp np4gs3b (r2.0)
ibm powernp np4gs3 preliminary network processor np3_dl_sec01_over.fm.08 may 18, 2001 general information page 27 of 554 1.3 overview the ibm powernp ? np4gs3 network processor enables network hardware designers to create fast, powerful, and scalable systems. the np4gs3 contains an embedded processor complex (epc) in which processors and coprocessors work with hardware accelerators to increase processing speed and power. additional features, such as integrated search engines, variable packet length schedulers, and support for qos functions, support the needs of customers who require high function, high capacity, media-rate switching. the np4gs3 is also highly scalable, capable of supporting systems with up to 1024 ports. the epc is the heart of the np4gs3, evaluating, defining, and processing data. it maximizes the speed and processing power of the device and provides it with functionality above that of an independent switching device. within the epc, eight dyadic protocol processor units (dppus) combine picocode processors, copro- cessors, and hardware accelerators to support functions such as high-speed pattern search, data manipula- tion, internal chip management, frame parsing, and data prefetching. the np4gs3 provides fast switching by integrating switching engine, search engine, and security functions on one device. it supports layer 2 and 3 ethernet frame switching, and includes three switch priority levels for port mirroring, high priority user frames, and low priority frames. it supports ethernet, packet over sonet (pos), and point-to-point protocol (ppp) protocols. because of the device ? s ability to enforce hundreds of rules with complex range and action specifications, np4gs3-based systems are uniquely suited for server clusters. np4gs3-based systems can range from a desktop system with a single device to multi-rack systems with up to 64 network processors. scaling of this nature is accomplished through the use of ibm's high performance, non-blocking, packet switching technology and ibm's data-aligned synchronous link (dasl) interface, which can be adapted to other switch technologies. using the np4gs3 with the ibm packet routing switch enables addressing of up to 640 ports. third-party switches can address up to 1024 ports. systems developed with the np4gs3 use a distributed software model. to support this model, the device hardware and code development suite include on-chip debugger facilities, a picocode assembler, and a picocode and system simulator, all of which can decrease the time to market for new applications. in order to take advantage of these features, a designer must know how the device works and how it fits into a system. the following sections discuss the basic placement of the device within a system, its major func- tional blocks, and the movement of data through it. the chapters following this overview explore all of these issues in detail. 1.4 np4gs3-based systems the np4gs3 is scalable, enabling multiple system configurations:  low-end systems with a single device that uses its own switch interface (swi) to wrap traffic from the ingress to the egress side  medium-end systems with two devices that are directly connected through their swis  high-end systems with up to 64 np4gs3s that are connected through a single or redundant switch; a sin- gle ibm packet routing switch can address up to 16 np4gs3s. high-end systems can address up to 256 gigabit ethernet or 1024 fast ethernet or pos (oc-3) ports.
ibm powernp np4gs3 network processor preliminary general information page 28 of 554 np3_dl_sec01_over.fm.08 may 18, 2001 systems developed with the np4gs3 use a distributed software model, which relies on a control point to execute software instructions. in a high-end system, the control point may be an external microprocessor connected through an ethernet link or the pci interface. the np4gs3's embedded powerpc processor can perform control point functions in a smaller system. in this model, functions are divided between the control point and the network processor, as illustrated in figure 1-1. the control point supports layer 2 and layer 3 routing protocols, layer 4 and layer 5 network applications, box maintenance, management information base (mib) collection (that is, the control point functions as an snmp agent), and other systems management functions. other functions, such as forwarding, filtering, and classification of the tables generated by the routing protocols, are performed by the dyadic protocol processor units (dppus) in each network processor in the system. the core language processors (clps) in each dppu execute the epc's core software instruction set, which includes conditional execution, conditional branching, signed and unsigned operations, counts of leading zeros, and more. figure 1-1. function placement in an np4gs3-based system (control point functions: l2 support (spanning tree...), l3 support (ospf...), l4 and l5 network applications, box services, and networking management agent (rmon...); network processor functions: l2 forwarding, filtering, and learning; l3 forwarding and filtering; l4 flow classification; priority shaping; network management counters; frame repository; queueing; flow control; frame alteration; and multicast handling)
ibm powernp np4gs3 preliminary network processor np3_dl_sec01_over.fm.08 may 18, 2001 general information page 29 of 554 1.5 structure the ibm powernp np4gs3 network processor has eight major functional blocks, as illustrated in figure 1-2 : epc provides all processing functions for the device. embedded powerpc can act as a control point for the device; the control store interface provides up to 128 mb of program space for the powerpc. ingress enqueuer / dequeuer / scheduler (ingress eds) provides logic for frames traveling from the physical layer devices to the switch fabric. egress enqueuer / dequeuer / scheduler (egress eds) provides logic for frames traveling from the switch fabric to the physical layer devices. ingress switch interface (ingress swi) transfers frames from the ingress eds to a switch fabric or another np4gs3. egress switch interface (egress swi) transfers frames from a switch fabric or another np4gs3 to the egress eds. ingress physical mac multiplexer (ingress pmm) receives frames from physical layer devices. egress physical mac multiplexer (egress pmm) transmits frames to physical layer devices.
ibm powernp np4gs3 network processor preliminary general information page 30 of 554 np3_dl_sec01_over.fm.08 may 18, 2001 1.5.1 epc structure the epc contains eight dyadic protocol processor units (dppus). each dppu contains two core language processors (clps) that share nine coprocessors, one coprocessor command bus, and a memory pool. the eight dppus share 32 threads, four of which are enhanced, and three hardware accelerators. together, the eight dppus are capable of operating on up to 32 frames in parallel. they share 16 k words (32 k words in np4gs3b (r2.0)) of internal picocode instruction store, providing 2128 million instructions per second (mips) of processing power. in addition, the epc contains a hardware classifier to parse frames on the fly, preparing them for processing by the picocode. figure 1-2. np4gs3 major functional blocks (embedded processor complex with the embedded 405 powerpc and spm interface; ingress and egress eds, each with enqueuer, dequeuer, and scheduler logic and internal sram data stores; ingress and egress switch interfaces; ingress and egress pmm with multiplexed macs connected through the dmu buses to the physical layer devices; external ddr sdram (10 to 13 devices) and zbt sram (2 devices))
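the 2128 mips figure above is consistent with 16 picocode processors (two clps in each of the eight dppus) each completing one instruction per cycle at a 133 mhz core clock; the one-instruction-per-cycle and clock-rate assumptions in this sketch are mine, not statements taken from this datasheet.

#include <stdio.h>

int main(void)
{
    const unsigned dppus         = 8;     /* dyadic protocol processor units          */
    const unsigned clps_per_dppu = 2;     /* core language processors per dppu        */
    const double   core_mhz      = 133.0; /* assumed clp clock, one instruction/cycle */

    double mips = dppus * clps_per_dppu * core_mhz;  /* 16 * 133 = 2128 */
    printf("aggregate picocode throughput: %.0f mips\n", mips);
    return 0;
}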
ibm powernp np4gs3 preliminary network processor np3_dl_sec01_over.fm.08 may 18, 2001 general information page 31 of 554 1.5.1.1 coprocessors each dppu contains two picocode processors, the clps, that execute the epc's core instruction set and control thread swapping and instruction fetching. the two clps share nine dedicated coprocessors that can run in parallel with the clps: checksum calculates and verifies frame header checksums. cab interface controls thread access to the control access bus (cab) through the cab arbiter; the cab control, cab arbiter, and cab interface enable debug access to np4gs3 data structures. counter updates counters for the picocode engines. data store interfaces frame buffer memory (ingress and egress directions), providing a 320-byte working area, and provides access to the ingress and egress data stores. enqueue manages control blocks containing key frame parameters; works with the completion unit hardware accelerator to enqueue frames to the switch and target port output queues. policy determines if the incoming data stream complies with configured profiles. string copy accelerates data movement between coprocessors within the shared memory pool. tree search engine performs pattern analysis through tree searches (based on algorithms provided by the picocode) and read and write accesses, all protected by memory range checking; accesses control store memory independently. semaphore manager assists in controlling access to shared resources, such as tables and control structures, through the use of semaphores; grants semaphores either in dispatch order (ordered semaphores) or in request order (unordered semaphores). 1.5.1.2 enhanced threads each clp can run two threads, making four threads per dppu, or 32 total. twenty-eight of the threads are general data handlers (gdhs), used for forwarding frames, and four of the 32 threads are enhanced: guided frame handler (gfh) handles guided frames, the in-band control mechanism between the epc and all devices in the system, including the control point. general table handler (gth) builds table data in control memory. general powerpc handler request (gph-req) processes frames bound to the embedded powerpc. general powerpc handler response (gph-resp) processes responses from the embedded powerpc.
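as a software illustration of the header checksum function that the checksum coprocessor performs in hardware, the following generic c routine computes the standard 16-bit one's-complement checksum over an ipv4 header (rfc 1071 style); it is not np4gs3 picocode.

#include <stddef.h>
#include <stdint.h>

static uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t len_bytes)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len_bytes; i += 2)        /* 16-bit big-endian words */
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    if (len_bytes & 1)                                   /* odd trailing byte       */
        sum += (uint32_t)hdr[len_bytes - 1] << 8;

    while (sum >> 16)                                    /* fold carries            */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;                               /* one's complement result */
}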
ibm powernp np4gs3 network processor preliminary general information page 32 of 554 np3_dl_sec01_over.fm.08 may 18, 2001 1.5.1.3 hardware accelerators the dppus share three hardware accelerators: completion unit assures frame order as data exits the threads. dispatch unit fetches data and parses the work out among the dppus. control store arbiter enables the processors to share access to the control store. 1.5.2 np4gs3 memory storage for the np4gs3 is provided by both internal and external memories (see figure 1-2 on page 30). the control store contains all tables, counters, and any other data needed by the picocode. the data stores contain the frame data to be forwarded and can be used by the picocode (via the data store coprocessor) to create guided traffic. the np4gs3 has the following stores: - a common instruction memory that holds 16 k instruction words (32 k in np4gs3b (r2.0)) for normal processing and control functions - 128 kb internal sram for input frame buffering - 113 kb internal sram control store - high capacity external ddr sdram for egress frame buffering and large forwarding tables; the amount of memory can vary depending on the configuration - external zbt sram for fast table access: up to 512 kb × 36 in the z0 interface, and up to 123 kb × 18 in the z1 interface (for use by the scheduler)
ibm powernp np4gs3 preliminary network processor np3_dl_sec01_over.fm.08 may 18, 2001 general information page 33 of 554 1.6 data flow 1.6.1 basic data flow too many data flow routes and possibilities exist to fully document in this overview. however, data generally moves through the np4gs3 in the following manner (see figure 1-3): 1. the ingress pmm receives a frame from a physical layer device and forwards it to the ingress eds. 2. the ingress eds identifies the frame and enqueues it to the epc. 3. the epc processes the frame data (see section 1.6.2). the epc may discard the frame or modify the frame data directly and then return the updated data to the ingress eds's data store. 4. the frame is enqueued to the ingress eds, and the ingress eds scheduler selects the frame for transmission and moves the data to the swi. 5. the swi forwards the frame to a switch fabric, to another np4gs3, or to the egress swi of this device. 6. the egress swi receives a frame from a switch fabric, another np4gs3, or from the ingress swi. 7. the swi forwards the frame to the egress eds, which reassembles the frame and enqueues it to the epc once it is fully reassembled. figure 1-3. data flow overview (steps 1-10 through the ingress pmm, ingress eds, embedded processor complex, ingress swi, switch fabric, egress swi, egress eds, and egress pmm, with physical layer devices at each end)
ibm powernp np4gs3 network processor preliminary general information page 34 of 554 np3_dl_sec01_over.fm.08 may 18, 2001 8. the epc processes it (see section 1.6.2). the epc may discard the frame or modify the frame directly and then return the data to the egress eds's data store. 9. the frame is enqueued to the egress eds, and the egress eds scheduler, if enabled, selects the frame for transmission and moves the data to the egress pmm. if the scheduler is not enabled, the epc may forward the frame to a target port queue, to a wrap port, or to the gfh or gph. 10. the egress pmm sends the frame to a physical layer device. 1.6.2 data flow in the epc figure 1-4. basic data flow in the epc (ingress steps 1-9 and egress steps 10-18, involving the ingress and egress eds queue interfaces, the dispatch unit dispatching to the first available dppu, the hardware classifier, instruction memory, control store arbiter, tree search engine, data store and enqueue coprocessors, and the completion unit)
ibm powernp np4gs3 preliminary network processor np3_dl_sec01_over.fm.08 may 18, 2001 general information page 35 of 554 the epc is the functional center of the device, and it plays a pivotal role in data flow. this section presents a basic overview of data flow in the epc. ingress side 1. the ingress eds enqueues a data frame to the epc. 2. the dispatch unit fetches a portion of a frame and sends it to the next available thread. 3. simultaneously, the hardware classifier (hc) determines the starting common instruction address (cia), parses different frame formats (for example: bridged, ip, and ipx), and forwards the results on to the thread. 4. the picocode examines the information from the hc and may examine the data further; it assembles search keys and launches the tree search engine (tse). 5. the tse performs table searches, using search algorithms based on the format of the downloaded tables. 6. the control store arbiter allocates control store memory bandwidth among the protocol processors. 7. frame data moves into the data store coprocessor ? s memory buffer.  forwarding and frame alteration information is identified by the results of the search.  the ingress eds can insert or overlay vlan tags on the frame (hardware-assisted frame alteration) or the picocode can allocate or remove buffers to allow alteration of the frame (flexible frame alter- ation). 8. the enqueue coprocessor builds the necessary information to enqueue the frame to the swi and pro- vides it to the completion unit (cu), which guarantees the frame order as the data moves from the 32 threads of the dppu to the ingress eds queues. 9. the frame is enqueued to the ingress eds.  the ingress eds forwards the frame to the ingress scheduler.  the scheduler selects the frame for transmission to the ingress swi. note: the entire frame is not sent at once. the scheduler sends it a cell at a time.  with the help of the ingress eds, the ingress switch data mover (i-sdm) (see section 4 beginning on page 133) segments the frames from the switch interface queues into 64-byte cells and inserts cell header and frame header bytes as they are transmitted to the swi. egress side 10. the egress eds enqueues a data frame to the epc. 11. the dispatch unit fetches a portion of a frame and sends it to the next available thread. 12. simultaneously, the hc determines the starting cia, parses different frame formats (for example: bridged, ip, and ipx), and forwards the results to the thread. 13. the picocode examines the information from the hc and may examine the data further; it assembles search keys and launches the tse. 14. the tse performs table searches, using search algorithms based on the format of the downloaded tables. 15. the control store arbiter allocates control store memory bandwidth among the protocol processors. 16. frame data moves into the data store coprocessor ? s memory buffer.  forwarding and frame alteration information is identified by the results of the search.  the np4gs3 provides two frame alteration techniques: hardware-assisted frame alteration and flexi- ble frame alteration:
ibm powernp np4gs3 network processor preliminary general information page 36 of 554 np3_dl_sec01_over.fm.08 may 18, 2001 in hardware-assisted frame alteration, commands are passed to the egress eds hardware during enqueueing. these commands can, for example, update the ttl field in an ip header, generate frame crc, or overlay an existing layer 2 wrapper with a new one. in flexible frame alteration, the picocode allocates additional buffers and the data store coprocessor places data into these buffers. the additional buffers allow prepending of data to a received frame and bypassing part of the received data when transmitting. this is useful for frame fragmentation, when an ip header and mac header must be prepended to received data in order to form a frame fragment of the correct size. 17. the enqueue coprocessor builds the necessary information to enqueue the frame to the egress eds and provides it to the cu, which guarantees the frame order as the data moves from the 32 threads of the dppu to the egress eds queues. 18. the frame is enqueued to the egress eds. - the egress eds forwards the frame to the egress scheduler (if enabled). - the scheduler selects the frame for transmission to a target port queue. - if the scheduler is not enabled, the eds will forward the frame directly to a target queue. - the egress eds selects frames for transmission from the target port queue and moves their data to the egress pmm.
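to make the ttl update mentioned above concrete, here is a generic c sketch that decrements the ipv4 ttl and patches the header checksum incrementally (in the style of rfc 1624); the byte offsets assume a plain ipv4 header, and the routine is an illustration rather than a description of the egress eds hardware.

#include <stdint.h>

static void ipv4_decrement_ttl(uint8_t *ip_hdr)
{
    uint16_t old_word = (uint16_t)((ip_hdr[8] << 8) | ip_hdr[9]);  /* ttl | protocol  */
    ip_hdr[8]--;                                                   /* ttl - 1         */
    uint16_t new_word = (uint16_t)((ip_hdr[8] << 8) | ip_hdr[9]);

    uint16_t csum = (uint16_t)((ip_hdr[10] << 8) | ip_hdr[11]);    /* header checksum */
    uint32_t sum  = (uint16_t)~csum + (uint16_t)~old_word + new_word;  /* rfc 1624    */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    csum = (uint16_t)~sum;
    ip_hdr[10] = (uint8_t)(csum >> 8);
    ip_hdr[11] = (uint8_t)csum;
}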
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 37 of 554 2. physical description figure 2-1. device interfaces (the np4gs3 connects to packet routing switches a and b, the pci interface, an eeprom interface, clock/test/reset signals, four port interfaces (a-d) each configurable for smii, gmii, tbi, or pos, sdram and sram data stores, and ddr sdram control stores on the d0/1/2/3, d4, and d6 interfaces. note: the memory array consists of the following ddr sdrams: drams 0 and 4 - 2 devices each; drams 1, 2, and 3 - 1 device each; dram 6 - 6 devices.)
ibm powernp np4gs3 network processor preliminary physical description page 38 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.1 pin information this section describes the many interfaces and associated pins of the np4gs3 network processor. for a summary of all the device ? s interfaces and how many pins each contains, see table 2-1: signal pin functions on page 38. for information on signal pin locations, see table 2-33: complete signal pin listing by signal name on page 86 and table 2-34: complete signal pin listing by grid position on page 96. the following table groups the interfaces and pins by function, briefly describes them, and points to the loca- tion of specific information in the chapter. table 2-1. signal pin functions (page 1 of 2) pin type function resources ibm 28.4 gbps packet routing switch interface interface with the ibm packet routing switch table 2-2: ibm 28.4 gbps packet routing switch interface pins on page 39 flow control table 2-3: flow control pins on page 41 z0 and z1 zbt sram interface interface with the z0 and z1 zbt sram for lookups table2-4:z0zbtsraminterfacepins on page 42 table2-5:z1zbtsraminterfacepins on page 42 figure 2-2: zbt sram timing diagram on page 43 d3, d2, d1, and d0 memory interface with the ddr sdram used to implement the d3, d2, d1, and d0 memories table 2-9: d3, d2, and d1 interface pins on page 49 table 2-10: d0 memory pins on page 51 figure 2-3: ddr control timing diagram on page 45 figure 2-4: ddr read timing diagram on page 46 figure 2-5: ddr write output timing diagram on page 47 d4_0 and d4_1 memory interface with the ddr dram used to implement the d4 memories table 2-11: d4_0 and d4_1 interface pins on page 51 figure 2-3: ddr control timing diagram on page 45 figure 2-4: ddr read timing diagram on page 46 figure 2-5: ddr write output timing diagram on page 47 d6_5, d6_4, d6_3, d6_2, d6_1, and d6_0 memory interface with the ddr sdram used to implement the powerpc store table 2-12: d6_5, d6_4, d6_3, d6_2, d6_1, and d6_0 mem- ory pins on page 52 figure 2-3: ddr control timing diagram on page 45 figure 2-4: ddr read timing diagram on page 46 figure 2-5: ddr write output timing diagram on page 47 ds1 and ds0 memory interface with the ddr dram used to implement the ds1 and ds0 memories table 2-13: ds1 and ds0 interface pins on page 53 figure 2-3: ddr control timing diagram on page 45 figure 2-4: ddr read timing diagram on page 46 figure 2-5: ddr write output timing diagram on page 47
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 39 of 554 2.1.1 packet routing switch interface pins pmm interface interface with the physical layer devices through the following buses: table 2-14: pmm interface pins on page 54 table 2-15: pmm interface pin multiplexing on page 55 figure 2-6: np4gs3 dmu bus clock connections on page 56 tbi table 2-16: parallel data bit to 8b/10b position mapping (tbi interface) on page 57 table 2-17: pmm interface pins: tbi mode on page 58 figure 2-8: tbi timing diagram on page 59 gmii table 2-19: pmm interface pins: gmii mode on page 61 figure 2-9: gmii timing diagram on page 62 smii table 2-21: pmm interface pins: smii mode on page 63 figure 2-10: smii timing diagram on page 64 pos figure 2-7: np4gs3 dmu bus clock connections (pos overview) on page 57 table 2-23: pmm interface pins pos32 mode on page 66 table 2-24: pos signals on page 67 figure 2-11: pos transmit timing diagram on page 69 figure 2-12: pos receive timing diagram on page 70 pci interface interface to the pci bus table 2-26: pci pins on page 72 figure 2-13: pci timing diagram on page 74 management bus translated into various ? host ? buses by an external fpga (spm) table 2-28: management bus pins on page 75 figure 2-14: spm bus timing diagram on page 76 miscellaneous various interfaces table 2-30: miscellaneous pins on page 77 table 2-31: signals requiring pull-up or pull-down on page 79 table 2-2. ibm 28.4 gbps packet routing switch interface pins (page 1 of 2) signal (clock domain) description type dasl_out_a(7:0) (switch clk * 8) the positive half of an output bus of eight custom low power differential drivers. runs at frequency switch_clock_a * 8. output dasl 1.5 v dasl_out_a(7:0) (switch clk * 8) the negative half of the 8-bit differential bus described above. runs at frequency switch_clock_a * 8. output dasl 1.5 v dasl_in_a(7:0) (switch clk * 8) the positive half of an input bus of eight custom low power differential receivers. runs at frequency switch_clock_a * 8. input dasl 1.5 v dasl_in_a(7:0) (switch clk * 8) the negative half of the 8-bit differential bus described above. runs at frequency switch_clock_a * 8. input dasl 1.5 v dasl_out_b(7:0) (switch clk * 8) the positive half of an output bus of eight custom low power differential drivers. runs at frequency switch_clock_b * 8. output dasl 1.5 v table 2-1. signal pin functions (page 2 of 2) pin type function resources
ibm powernp np4gs3 network processor preliminary physical description page 40 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 dasl_out_b(7:0) (switch clk * 8) the negative half of the 8-bit differential bus described above. runs at frequency switch_clock_b * 8. output dasl 1.5 v dasl_in_b(7:0) (switch clk * 8) the positive half of an input bus of eight custom low power differential receivers. runs at frequency switch_clock_b * 8. input dasl 1.5 v dasl_in_b(7:0) (switch clk * 8) the negative half of the 8-bit differential bus described above. runs at frequency switch_clock_b * 8. input dasl 1.5 v master_grant_a(1:0) (switch clk * 2) master grant a indicates whether the ? a ? connection of the switch fabric is able to receive cells from the np4gs3. the definitions of these i/os are configured by the master grant mode configuration registers. see section 13.2 on page 445. input 5.0 v-tolerant 3.3 v lvttl 3.3 v master_grant_b(1:0) (switch clk * 2) master grant b indicates whether the ? b ? connection of the switch fabric is able to receive cells from the np4gs3. the definitions of these i/os are configured by the master grant mode configuration registers. see section 13.2 on page 445. input 5.0 v-tolerant 3.3 v lvttl 3.3 v multicast_grant_a(1:0) multicast grant a indicates whether the ? a ? connection of the switch fabric is able to receive multicast cells. bits (1:0) definition 00 no grants 01 priority 0 and 1 granted 10 priority 2 granted 11 priority 0, 1 and 2 granted bit 0 of this bus serves as the master grant for the high priority channel, and bit 1 for the low priority channel. this signal runs at frequency switch clock * 2. input 5.0 v-tolerant 3.3 v lvttl 3.3 v multicast_grant_b(1:0) multicast grant b indicates whether the ? b ? connection of the switch fabric is able to receive multicast cells from the np4gs3. bits (1:0) definition 00 no grants 01 priority 0 and 1 granted 10 priority 2 granted 11 priority 0, 1 and 2 granted bit 0 of this bus serves as the master grant for the high priority channel, and bit 1 for the low priority channel. this signal runs at frequency switch clock * 2. input 5.0 v-tolerant 3.3 v lvttl 3.3 v send_grant_a (switch clk * 2) send grant a indicates whether the ? a ? connection of the np4gs3 is able to receive cells from the switch fabric. 0 unable (the packet routing switch should send only idle cells) 1able the np4gs3 changes the state of this signal. output 5.0 v-tolerant 3.3 v lvttl 3.3 v send_grant_b (switch clk * 2) send grant b indicates whether the ? b ? connection of the np4gs3 is able to receive cells from the switch fabric. 0 unable (the packet routing switch should send only idle cells) 1able the np4gs3 changes the state of this signal. output 5.0 v-tolerant 3.3 v lvttl 3.3 v table 2-2. ibm 28.4 gbps packet routing switch interface pins (page 2 of 2) signal (clock domain) description type
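the 2-bit multicast grant encoding in table 2-2 can be read with a small helper like the one below; the function is illustrative only and is not part of any np4gs3 programming interface.

#include <stdbool.h>
#include <stdint.h>

static bool multicast_granted(uint8_t grant_bits, unsigned priority /* 0, 1 or 2 */)
{
    switch (grant_bits & 0x3) {
    case 0x0: return false;          /* 00: no grants                   */
    case 0x1: return priority <= 1;  /* 01: priority 0 and 1 granted    */
    case 0x2: return priority == 2;  /* 10: priority 2 granted          */
    default:  return true;           /* 11: priority 0, 1 and 2 granted */
    }
}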
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 41 of 554 2.1.2 flow control interface pins table 2-3. flow control pins signal description type i_freeq_th ingress free queue threshold 0 threshold not exceeded 1 threshold exceeded output 5.0 v-tolerant 3.3 v lvttl 3.3 v res_sync remote egress status synchronization (sync) is driven by the network processor that is con- figured to provide this signal. it is received by all other network processors. 1 shared data bus sync pulse. indicates start of time division multiplex cycle. input/output 5.0 v-tolerant 3.3 v lvttl 3.3 v res_data remote egress status data is driven by a single network processor during its designated time slot. 0 not exceeded 1 network processor ? s exponentially weighted moving average (ewma) of the egress offered rate exceeds the configured threshold. input/output 5.0 v-tolerant 3.3 v lvttl 3.3 v
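res_data above is derived from an exponentially weighted moving average (ewma) of the egress offered rate; the following generic c sketch shows the usual ewma update and threshold comparison, with an arbitrary smoothing weight that is not an np4gs3 parameter.

static double ewma_update(double average, double sample, double weight /* 0..1, assumed */)
{
    /* new average = (1 - weight) * old average + weight * new sample */
    return (1.0 - weight) * average + weight * sample;
}

static int res_data_state(double average, double threshold)
{
    return average > threshold;   /* 1: threshold exceeded, 0: not exceeded */
}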
ibm powernp np4gs3 network processor preliminary physical description page 42 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.1.3 zbt interface pins these pins interface with z0 and z1 zbt sram for lookups as described in table 2-4 and table 2-5. table 2-4. z0 zbt sram interface pins signal description type lu_clk look-up clock. 7.5 ns period (133 mhz). output cmos 2.5 v lu_addr(18:0) look-up address signals are sampled by the rising edge of lu_clk. output cmos 2.5 v lu_data(35:0) look-up data. when used as sram inputs, the rising edge of lu_clk samples these signals. input/output cmos 2.5 v lu_r_wrt look-up read/write control signal is sampled by the rising edge of lu_clk. 0 write, 1 read. output cmos 2.5 v table 2-5. z1 zbt sram interface pins signal description type sch_clk sram clock input. 7.5 ns period (133 mhz). output cmos 2.5 v sch_addr(18:0) sram address signals are sampled by the rising edge of lu_clk. output cmos 2.5 v sch_data(17:0) data bus. when used as sram input, the rising edge of sch_clk samples these signals. input/output cmos 2.5 v sch_r_wrt read/write control signal is sampled by the rising edge of sch_clk. 0 write, 1 read. output cmos 2.5 v
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 43 of 554 figure 2-2. zbt sram timing diagram (waveforms for xx_clk, xx_addr, xx_r_wrt, and xx_data outputs and inputs; timing symbols are defined in table 2-6. notes: 1) xx = lu or sch; 2) vdd = 2.5 v; 3) output load 50 ohms and 30 pf)
ibm powernp np4gs3 network processor preliminary physical description page 44 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.1.4 ddr dram interface pins the pins described here interface with ddr dram to implement data store, control store, and the powerpc store. the control, read, and write timing diagrams ( figure 2-3 , figure 2-4 , and figure 2-5 )applytoallpin tables in this section. table 2-6. zbt sram timing diagram legend (for figure 2-2) symbol symbol description np4gs3a (r1.1) np4gs3 minimum (ns) maximum (ns) minimum (ns) maximum (ns) t ck zbt cycle time 7.5 7.5 t ch clock pulse width high 2.8 4.2 3.3 4.1 t cl clock pulse width low 3.3 4.7 3.4 4.2 t da address output delay 2.2 4.3 1.0 4.7 t dwe read/write output delay 2.9 5.3 1.1 2.6 t dd data output delay 1.7 5.0 0.7 2.7 t dckon data output turn on 1.0 3.2 1.3 4.1 t dckoff data output turn off 0.5 2.3 0.8 3.0 t ds input data setup time 1.5 1.0 t dh input data hold time 1.7 0 note: all delays are measured with 1 ns slew time measured from 10 - 90% of input voltage. note: column for np4gs3 is for all releases of the np other than r1.1.
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 45 of 554 figure 2-3. ddr control timing diagram (waveforms for dy_clk/dy_nclk, dx_addr, dx_we, dx_cs, dy_ba, dy_ras, and dy_cas; timing symbols are defined in table 2-7 and table 2-8. notes: 1) dx = d0, d1, d2, d3, d4, d6, ds0, ds1; 2) dy = da, db, dc, dd, de; 3) vdd = 2.5 v; 4) output load 50 ohms and 30 pf)
ibm powernp np4gs3 network processor preliminary physical description page 46 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 figure 2-4. ddr read timing diagram (waveforms for dy_clk/dy_nclk, dx_dqs, and dx_dq; timing symbols are defined in table 2-7 and table 2-8. notes: 1) dx = d0, d1, d2, d3, d4, d6, ds0, ds1; 2) dy = da, db, dc, dd, de; 3) vdd = 2.5 v; 4) output load 50 ohms and 30 pf)
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 47 of 554 figure 2-5. ddr write output timing diagram (waveforms for dy_clk/dy_nclk, dx_dqs, and dx_dq with data setup and hold to the strobe; timing symbols are defined in table 2-7 and table 2-8. notes: 1) dx = d0, d1, d2, d3, d4, d6, ds0, ds1; 2) dy = da, db, dc, dd, de; 3) vdd = 2.5 v; 4) output load 50 ohms and 30 pf)
ibm powernp np4gs3 network processor preliminary physical description page 48 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 table 2-7. ddr timing diagram legend (for figure 2-3, figure 2-4, and figure 2-5) values are for d0, d1, d2, d3, d4, ds0 and ds1 symbol symbol description minimum (ns) maximum (ns) t ck ddr clock cycle time 7.5 t dqsck dy_clk to dx_dqs strobe delay 0.2 1.7 t ch clock pulse width high 0.45 * t ck 0.55 * t ck t cl clock pulse width low 0.45 * t ck 0.55 * t ck t dqsq dq data to dqs skew 0.7 0.8 t da address output delay 1.7 4.8 t dw write enable output delay 2.0 5.2 t dcs chip select output delay 1.9 5.2 t ba bank address output delay 1.9 5.2 t dras ras output delay 1.7 5.4 t dcas cas output delay 1.8 5.4 t ds data to strobe setup time 0.7 t dh data to strobe hold time 0.9 note: all delays are measured with 1 ns slew time measured from 10-90% of input voltage. all measurements made with test load of 50 ohms and 30 pf.
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 49 of 554 2.1.4.1 d3, d2, d1, and d0 interface pins these pins interface with the ddr sdram used to implement the d3, d2, d1, and d0 control stores. table 2-8. ddr timing diagram legend (for figure 2-3, figure 2-4, and figure 2-5) values are for d6. symbol symbol description minimum (ns) maximum (ns) t ck ddr clock cycle time 7.5 t dqsck dy_clk to dx_dqs strobe delay 0.3 2.1 t ch clock pulse width high 0.45 * t ck 0.55 * t ck t cl clock pulse width low 0.45 * t ck 0.55 * t ck t dqsq dq data to dqs skew 0.4 0.8 t da address output delay 2.4 6.0 t dw write enable output delay 2.4 5.6 t dcs chip select output delay 2.5 5.9 t ba bank address output delay 2.3 5.7 t dras ras output delay 2.4 5.7 t dcas cas output delay 2.5 5.7 t ds data to strobe setup time 0.8 t dh data to strobe hold time 0.6 note: all delays are measured with 1 ns slew time measured from 10-90% of input voltage. all measurements made with test load of 50 ohms and 30 pf. table 2-9. d3, d2, and d1 interface pins (page 1 of 2) signal description type shared signals db_clk the positive pin of an output differential pair. 133 mhz. common to the d3, d2, and d1 memory devices. output sstl2 2.5 v db_clk the negative pin of an output differential pair. 133 mhz. common to the d3, d2, and d1 memory devices. output sstl2 2.5 v db_ras common row address strobe (common to d3, d2, and d1). output sstl2 2.5 v db_cas common column address strobe (common to d3, d2, and d1). output sstl2 2.5 v db_ba(1:0) common bank address (common to d3, d2, and d1). output sstl2 2.5 v d3 signals d3_addr(12:0) d3 address output cmos 2.5 v
ibm powernp np4gs3 network processor preliminary physical description page 50 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 d3_dqs(1:0) d3 data strobes input/output sstl2 2.5 v d3_data(15:0) d3 data bus input/output sstl2 2.5 v d3_we d3 write enable output cmos 2.5 v d3_cs d3 chip select output cmos 2.5 v d2 signals d2_addr(12:0) d2 address output cmos 2.5 v d2_dqs(1:0) d2 data strobes input/output sstl2 2.5 v d2_data(15:0) d2 data bus input/output sstl2 2.5 v d2_we d2 write enable output cmos 2.5 v d2_cs d2 chip select output cmos 2.5 v d1 signals d1_addr(12:0) d1 address output cmos 2.5 v d1_dqs(1:0) d1 data strobes input/output sstl2 2.5 v d1_data(15:0) d1 data bus input/output sstl2 2.5 v d1_we d1 write enable output cmos 2.5 v d1_cs d1 chip select output cmos 2.5 v table 2-9. d3, d2, and d1 interface pins (page 2 of 2) signal description type
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 51 of 554 2.1.4.2 d4_0 and d4_1 interface pins these pins interface with the ddr dram used to implement the d4 control store. table 2-10. d0 memory pins signal description type d0_0 and d0_1 shared signals de_clk the positive pin of an output differential pair. 133 mhz. common to the d0_0/1 memory devices. output sstl2 2.5 v de_clk the negative pin of an output differential pair. 133 mhz. common to the d0_0/1 devices. output sstl2 2.5 v de_ras common row address strobe output cmos 2.5 v de_cas common column address strobe output cmos 2.5 v de_ba(1:0) common bank address output cmos 2.5 v d0_addr(12:0) d0 address output cmos 2.5 v d0_dqs(3:0) d0 data strobes input/output sstl2 2.5 v d0_data(31:0) d0 data bus input/output sstl2 2.5 v d0_we d0 write enable output cmos 2.5 v d0_cs d0 chip select output cmos 2.5 v table 2-11. d4_0 and d4_1 interface pins (page 1 of 2) signal description type dd_clk the positive pin of an output differential pair. 133 mhz. common to the d4_0/1 memory devices. output sstl2 2.5 v dd_clk the negative pin of an output differential pair. 133 mhz. common to the d4_0/1 memory devices. output sstl2 2.5 v dd_ras common row address strobe output cmos 2.5 v
ibm powernp np4gs3 network processor preliminary physical description page 52 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.1.4.3 d6_ x interface pins these pins interface with the ddr sdram used to implement the powerpc store. dd_cas common column address strobe output cmos 2.5 v dd_ba(1:0) common bank address output cmos 2.5 v d4_addr(12:0) d4 address output cmos 2.5 v d4_dqs(3:0) d4 data strobes. when strobe_cntl is set to '01' only d4_dqs(0) is in use. (np4gs3b (r2.0) or later). input/output sstl2 2.5 v d4_data(31:0) d4 data bus input/output sstl2 2.5 v d4_we d4 write enable output cmos 2.5 v d4_cs d4 chip select output cmos 2.5 v table 2-12. d6_5, d6_4, d6_3, d6_2, d6_1, and d6_0 memory pins (page 1 of 2) signal description type da_clk the positive pin of an output differential pair. 133 mhz. common to the d6 memory devices. output sstl2 2.5 v da_clk the negative pin of an output differential pair. 133 mhz. common to the d6 memory devices. output sstl2 2.5 v da_ras common row address strobe (common to d6). output sstl2 2.5 v da_cas common column address strobe (common to d6). output sstl2 2.5 v da_ba(1:0) common bank address (common to d6). output sstl2 2.5 v d6_we common write enable (common to d6). output sstl2 2.5 v d6_addr(12:0) d6 address output sstl2 2.5 v table 2-11. d4_0 and d4_1 interface pins (page 2 of 2) signal description type
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 53 of 554 2.1.4.4 ds1 and ds0 interface pins these pins interface with the ddr dram used to implement the ds1 and dso data stores. d6_cs d6 chip select output sstl2 2.5 v d6_dqs(3:0) d6 data strobes. when d6 is configured for 16-bit interface mode (d6_dram_size = '0xx'), then only bits 0 and 2 are used (np4gs3b (r2.0)). input/output sstl2 2.5 v d6_data(15:0) d6 data bus input/output sstl2 2.5 v d6_byteen(1:0) d6 byte enables byte masking write to d6. input/output sstl2 2.5 v d6_parity(1:0) d6 parity signals, one per byte. must go to separate chips to allow for byte write capability. input/output sstl2 2.5 v d6_dqs_par(1:0) d6 data strobe for the parity signals input/output sstl2 2.5 v table 2-13. ds1 and ds0 interface pins signal description type shared signals dc_clk the positive pin of an output differential pair. 133 mhz. common to the ds1 and ds0 memory devices. output sstl2 2.5 v dc_clk the negative pin of an output differential pair. 133 mhz. common to the ds1 and ds0 memory devices. output sstl2 2.5 v dc_ras common row address strobe (common to ds1 and ds0). output sstl2 2.5 v dc_cas common column address strobe (common to ds1 and ds0). output sstl2 2.5 v dc_ba(1:0) common bank address (common to ds1 and ds0). output sstl2 2.5 v ds1 signals ds1_addr(12:0) ds1 address output cmos 2.5 v ds1_dqs(3:0) ds1 data strobes. when strobe_cntl is set to '01' only ds1_dqs(0) is in use. (np4gs3b (r2.0) or later). input/output sstl2 2.5 v table 2-12. d6_5, d6_4, d6_3, d6_2, d6_1, and d6_0 memory pins (page 2 of 2) signal description type
ibm powernp np4gs3 network processor preliminary physical description page 54 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.1.5 pmm interface pins these pins allow the physical mac multiplexer (pmm) to interface with the physical layer devices. the np4gs3 has different sets of pins for the ten-bit (tbi), gigabit media-independent (gmii), serial media- independent (smii), and packet over sonet (pos) interfaces. ds1_data(31:0) ds1 data bus input/output sstl2 2.5 v ds1_we ds1 write enable output cmos 2.5 v ds1_cs ds1 chip select output cmos 2.5 v ds0 signals ds0_addr(12:0) ds0 address output cmos 2.5 v ds0_dqs(3:0) ds0 data strobes. when strobe_cntl is set to '01' only ds0_dqs(0) is in use. (np4gs3b (r2.0) or later). input/output sstl2 2.5 v ds0_data(31:0) ds0 data bus input/output sstl2 2.5 v ds0_we ds0 write enable output cmos 2.5 v ds0_cs ds0 chip select output cmos 2.5 v table 2-14. pmm interface pins signal description type dmu_a(30:0) define the first of the four pmm interfaces and can be configured for tbi, smii, gmii, or pos. see table 2-15: pmm interface pin multiplexing on page 55 for pin directions and def- initions. 5.0 v-tolerant 3.3 v lvttl 3.3 v dmu_b(30:0) define the second of the four pmm interfaces and can be configured for tbi, smii, gmii, or pos. see table 2-15: pmm interface pin multiplexing on page 55 for pin directions and def- initions. 5.0 v-tolerant 3.3 v lvttl 3.3 v dmu_c(30:0) define the third of the four pmm interfaces and can be configured for tbi, smii, gmii, or pos. see table 2-15: pmm interface pin multiplexing on page 55 for pin directions and def- initions. 5.0 v-tolerant 3.3 v lvttl 3.3 v dmu_d(30:0) define the fourth of the four pmm interfaces and can be configured for tbi, smii, gmii, debug, or pos. see table 2-15: pmm interface pin multiplexing on page 55 for pin direc- tions and definitions. 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_lbyte(1:0) receive last byte position (valid for 32-bit pos only) provides the position of the last byte within the final word of the packet transfer. this signal is valid only when rx_eof is high. input 5.0 v-tolerant 3.3 v lvttl 3.3 v table 2-13. ds1 and ds0 interface pins signal description type
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 55 of 554 table 2-15. pmm interface pin multiplexing pin(s) pin mode interface type dmu_a, dmu_b, dmu_c dmu_d gmii tbi smii debug (dmu_d only) 8-bit pos 30 o o rxaddr(1) o 29 o o rxaddr(0) o 28 o o txaddr(1) o 27 o o txaddr(0) o 26 o o txsof o 25 o o tx_valid_byte o txeof o (24:17) o i/o tx_data(7:0) o tx_data(0:7) o tx_data(9:2) o debug(23:16) i/o txdata(7:0) o (16:9) i i/o rx_data(7:0) i rx_data(0:7) i rx_data(9:2) i debug(15:8) i/o rxdata(7:0) i 8oo tx_clk 8ns tx_clk 8ns ??? 7oi/o tx_en o tx_data(8) o tx_data(1) o debug(7) i/o txen o 6i/oi/o tx_er o tx_data(9) o tx_data(0) o debug(6) i/o txpfa i 5ii/o rx_valid_byte i rx_data(8) i rx_data(1) i debug(5) i/o rxpfa i 4ii/o tx_byte_credit i rx_data(9) i rx_data(0) i debug(4) i/o rxval i 3ii/o rx_clk i 8ns rx_clk1 i 16 ns clk i 8ns debug(3) o clk i 10 ns 2i/oi/o rx_dv i rx_clk0 i 16 ns sync o debug(2) i/o rxeof i 1i/oi/o rx_er i sig_det i sync2 o debug(1) i/o rxerr i 0i/oi/o cpdetect (0 = cpf) - input cpdetect (0 = cpf) - input activity - output cpdetect (0 = cpf) - input debug(0) i/o rxenb o
ibm powernp np4gs3 network processor preliminary physical description page 56 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 figure 2-6. np4gs3 dmu bus clock connections (gmii, tbi, and smii clocking examples showing the clock125 and clock_core sources, the tx_clk, rx_clk, and rx_clk0/rx_clk1 connections, and the dmu_*(8), dmu_*(3), and dmu_*(2) pins. notes: each example illustrates a single dmu bus and applies to any of the four dmu busses; the 'dmu_*' labels represent any of the four dmu busses (dmu_a, dmu_b, dmu_c, or dmu_d); trace lengths to all inputs will be matched on the card.)
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 57 of 554 2.1.5.1 tbi bus pins figure 2-7. np4gs3 dmu bus clock connections (pos overview) (pos framer clocking from a 100 mhz oscillator to the dmu_*(8) and dmu_*(3) pins, shown for a single dmu bus (applies to dmu a-d, except oc-48) and for all four dmu busses in the oc-48 case. note: trace lengths to all inputs will be matched on the card.) table 2-16. parallel data bit to 8b/10b position mapping (tbi interface): parallel data bits 0 1 2 3 4 5 6 7 8 9 map to 8b/10b bit positions a b c d e f g h i j, respectively.
ibm powernp np4gs3 network processor preliminary physical description page 58 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 table 2-17. pmm interface pins: tbi mode signal description type tx_data(9:0) transmit data. data bus to the phy, synchronous to tx_clk. output 5.0 v-tolerant 3.3 vlvttl 3.3 v rx_data(9:0) receive data. data bus from the phy, synchronous to rx_clk1 and rx_clk0. (data switches at double the frequency of rx_clk1 or rx_clk0.) input 5.0 v-tolerant 3.3 vlvttl 3.3 v rx_clk1 receive clock, 62.5 mhz. rx_data is valid on the rising edge of this clock. input 5.0 v-tolerant 3.3 vlvttl 3.3 v rx_clk0 receive clock, 62.5 mhz. this signal is 180 degrees out of phase with rx_clk1. rx_data is valid on the rising edge of this clock. input 5.0 v-tolerant 3.3 vlvttl 3.3 v sig_det signal detect. signal asserted by the phy to indicate that the physical media are valid. input 5.0 v-tolerant 3.3 vlvttl 3.3 v cpdetect the control point card drives this signal active low to indicate its presence. when a non- control point card is plugged in, or this device pin is not connected, this signal should be pulled to a ? 1 ? on the card. during operation, this signal is driven by the network processor to indicate the status of the interface. 0 tbi interface is not in the data pass state (link down) 1 tbi interface is in the data pass state (occurs when auto-negotiation is complete, or when idles are detected (if an is disabled)) pulse tbi interface is in a data pass state and is either receiving or transmitting. the line pulses once per frame transmitted or received at a maximum rate of 8hz. input/output 5.0 v-tolerant 3.3 vlvttl 3.3 v tx_clk 125 mhz clock transmit clock to the phy. during operation, the network processor drives this signal to indicate that a transmit or receive is in progress for this interface. output 5.0 v-tolerant 3.3 vlvttl 3.3 v note: see table 2-15: pmm interface pin multiplexing on page 55 for pin directions (i/o) and definitions.
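because rx_data changes at twice the rate of either receive clock, a simple way to picture the tbi receive interface is a sampler that captures a 10-bit code group on each rising edge of rx_clk0 and rx_clk1; the c model below is purely illustrative and is not a description of the device's internal logic.

#include <stdbool.h>
#include <stdint.h>

struct tbi_rx_sampler {
    bool prev_clk0;
    bool prev_clk1;
};

/* call whenever either clock changes; returns true when a code group was captured */
static bool tbi_sample(struct tbi_rx_sampler *s, bool rx_clk0, bool rx_clk1,
                       uint16_t rx_data, uint16_t *code_group)
{
    bool rising0 = rx_clk0 && !s->prev_clk0;
    bool rising1 = rx_clk1 && !s->prev_clk1;

    s->prev_clk0 = rx_clk0;
    s->prev_clk1 = rx_clk1;

    if (rising0 || rising1) {            /* rx_data valid on either rising edge */
        *code_group = rx_data & 0x3FF;   /* 10-bit tbi code group               */
        return true;
    }
    return false;
}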
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 59 of 554 figure 2-8. tbi timing diagram (transmit timings for tx_clk and tx_data; receive timings for rx_clk0, rx_clk1, rx_data, and rx_sig_det; timing symbols are defined in table 2-18. notes: 1) vdd = 3.3 v; 2) output load 50 ohms and 30 pf)
ibm powernp np4gs3 network processor preliminary physical description page 60 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 table 2-18. tbi timing diagram legend (for figure 2-8) symbol symbol description minimum (ns) maximum (ns) t xck tx_clk transmit cycle time 8 t xch tx_clk pulse width high 3.5 4.0 t xcl tx_clk pulse width low 4.0 4.5 t dd tx_data_(7:0) output delay 3.2 4.7 t rck rx_clk0/rx_clk1 receive cycle time 16 t rch rx_clk0/rx_clk1 pulse width high 7 t rcl rx_clk0/rx_clk1 pulse width low 7 t rds rx_data_(9:0) setup time clk0 0.2 t rdh rx_data_(9:0) hold time clk0 0 t rss sig_det setup time clk0 0.8 t rsh sig_det hold time clk0 0.9 t rds rx_data_(9:0) setup time clk1 0.2 t rdh rx_data_(9:0) hold time clk1 0 t rss sig_det setup time clk1 1.0 t rsh sig_det hold time clk1 0.3 note: all delays are measured with 1 ns slew time measured from 10-90% of input voltage.
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 61 of 554 2.1.5.2 gmii bus pins table 2-19. pmm interface pins: gmii mode signal description type tx_data(7:0) transmit data. data bus to the phy, synchronous to tx_clk. output 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_data(7:0) received data. data bus from the phy, synchronous to rx_clk. input 5.0 v-tolerant 3.3 v lvttl 3.3 v tx_en transmit data enabled to the phy, synchronous to tx_clk. 0 end of frame transmission 1 active frame transmission output 5.0 v-tolerant 3.3 v lvttl 3.3 v tx_er transmit error, synchronous to the tx_clk. 0 no error detected 1 informs the phy that mac detected an error output 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_valid_byte receive valid data, synchronous to the rx_clk. 0 data invalid 1 byte of data (from the phy) on rx_data is valid. for a standard gmii connection, this signal can be tied to ? 1 ? on the card. input 5.0 v-tolerant 3.3 v lvttl 3.3 v tx_byte_credit transmit next data value, asynchronous. 0 do not send next data byte 1 asserted. phy indicates that the next tx_data value may be sent. for a standard gmii connection, this signal can be tied to ? 1 ? on the card. input 5.0 v-tolerant 3.3 v lvttl 3.3 v tx_valid_byte transmit valid data, synchronous to tx_clock 0 data invalid 1 byte of data (from the network processor) on tx_data is valid. output 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_clk 125 mhz receive medium clock generated by the phy. input 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_dv receive data valid (from the phy), synchronous to rx_clk. 0 end of frame transmission. 1 active frame transmission. input 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_er receive error, synchronous to rx_clk. 0 no error detected 1 informs the mac that phy detected an error input 5.0 v-tolerant 3.3 v lvttl 3.3 v tx_clk 125 mhz transmit clock to the phy. during operation, the network processor drives this sig- nal to indicate that a transmit is in progress for this interface. output 5.0 v-tolerant 3.3 v lvttl 3.3 v cpdetect the control point card drives this signal active low to indicate its presence. when a non- control point card is plugged in, or this device pin is not connected, this signal should be pulledtoa ? 1 ? on the card. input 5.0 v-tolerant 3.3 v lvttl 3.3 v note: the np4gs3 supports gmii in full-duplex mode only. see table 2-15: pmm interface pin multiplexing on page 55 for pin directions (i/o) and definitions.
ibm powernp np4gs3 network processor preliminary physical description page 62 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 figure 2-9. gmii timing diagram (transmit timings for tx_clk, tx_data, tx_en, tx_er, tx_valid_byte, and tx_byte_credit; receive timings for rx_clk, rx_data, rx_valid_byte, rx_er, and rx_dv; timing symbols are defined in table 2-20. notes: 1) vdd = 3.3 v; 2) output load 50 ohms and 30 pf)
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 63 of 554 2.1.5.3 smii bus pins table 2-20. gmii timing diagram legend (for figure 2-9) symbol symbol description minimum (ns) maximum (ns) t ck tx_clk cycle time 8 t xch transmit clock pulse width high 3.5 3.9 t xcl transmit clock pulse width low 4.1 4.5 t rch receive clock pulse width high 2.5 t rcl receive clock pulse width low 2.5 t dd tx_data output delay 3.7 4.6 t der tx_er output delay 3.7 4.6 t dvb tx_valid_byte output delay 3.7 4.4 t den tx_en output delay 3.2 4.7 t rds rx_data setup time 1.9 t rdh rx_data hold time 0 t rvs rx_valid_byte setup time 1.9 t rvh rx_valid_byte hold time 0 t res rx_er setup time 1.8 t reh rx_er hold time 0 t rdvs rx_dv setup time 1.9 t rdvh rx_dv hold time 0 t bcs tx_byte_credit setup time 1.9 t bch tx_byte_credit hold time 0 1. all delays are measured with 1 ns slew time measured from 10-90% of input voltage. table 2-21. pmm interface pins: smii mode signal description type tx_data(9:0) transmit data. data bus to the phy - contains ten streams of serial transmit data. each serial stream is connected to a unique port. synchronous to the common clock (clk). output 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_data(9:0) received data. data bus from the phy - contains ten streams of serial receive data. each serial stream is connected to a unique port. synchronous to the common clock (clk). input 5.0 v-tolerant 3.3 v lvttl 3.3 v
ibm powernp np4gs3 network processor preliminary physical description page 64 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 table 2-21. pmm interface pins: smii mode (continued) signal description type sync asserted for one tx_clk cycle once every ten tx_clk cycles. assertion indicates the beginning of a 10-bit segment on both tx_data and rx_data. output 5.0 v-tolerant 3.3 v lvttl 3.3 v sync2 logically identical to sync and provided for fanout purposes. output 5.0 v-tolerant 3.3 v lvttl 3.3 v cpdetect the control point card drives this signal active low to indicate its presence. when a non-control point card is plugged in, or this device pin is not connected, this signal should be pulled to a '1' on the card. input 5.0 v-tolerant 3.3 v lvttl 3.3 v figure 2-10. smii timing diagram (transmit timings for clk, tx_data, sync, and sync2; receive timings for rx_data; timing symbols are defined in table 2-22. notes: 1) vdd = 3.3 v; 2) output load 50 ohms and 30 pf)
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 65 of 554 table 2-22. smii timing diagram legend (for figure 2-10) symbol symbol description minimum (ns) maximum (ns) t ck clk cycle time 8 t ch clk pulse width high 4 t cl clk pulse width low 4 t dd tx_data_(9:0) output delay 1.9 4.7 tds sync output delay 2.2 4.5 tds2 sync2 output delay 2.3 4.5 trs rx_data_(9:0) setup time 0.8 trh rx_data_(9:0) hold time 0 1. all delays are measured with 1 ns slew time measured from 10-90% of input voltage.
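the sync/10-bit-segment relationship described in table 2-21 can be modeled in software as a per-stream deserializer that restarts on sync and collects one bit per clk; the sketch below is an illustrative model only (the actual contents of each segment follow the smii specification and are not detailed here).

#include <stdbool.h>
#include <stdint.h>

struct smii_deserializer {
    uint16_t shift;   /* bits collected so far, lsb = first bit after sync */
    unsigned count;   /* number of bits collected in the current segment   */
};

/* call once per clk cycle for one serial stream; returns true when a 10-bit segment is complete */
static bool smii_clock(struct smii_deserializer *d, bool sync, bool rx_bit, uint16_t *segment)
{
    if (sync) {                       /* sync marks the start of a new 10-bit segment */
        d->shift = 0;
        d->count = 0;
    }
    d->shift |= (uint16_t)((rx_bit ? 1u : 0u) << d->count);
    if (++d->count == 10) {
        *segment = d->shift;
        d->shift = 0;
        d->count = 0;
        return true;
    }
    return false;
}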
ibm powernp np4gs3 network processor preliminary physical description page 66 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.1.5.4 pos bus pins table 2-23. pmm interface pins pos32 mode pin(s) dmu_a dmu_b dmu_c dmu_d 28 txpadl(1) o 27 txpadl(0) o 26 txsof o 25 txeof o (24:17) txdata(31:24) o txdata(23:16) o txdata(15:8) o txdata(7:0) o (16:9) rxdata(31:24) i rxdata(23:16) i rxdata(15:8) i rxdata(7:0) i 8 ? 7 txen o 6 txpfa i 5 rxpfa i 4 rxval i 3 clk i 10 ns clk i 10 ns clk i 10 ns clk i 10 ns 2 rxeof i 1 rxerr i 0 rxenb o single pins (not associated with a dmu) pin rx_padl(1:0) 1 rxpadl(1) i 0 rxpadl(0) i
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 67 of 554 table 2-24. pos signals (page 1 of 2) signal description type rxaddr(1:0) receive address bus selects a particular port in the framer for a data transfer. valid on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v rxdata (7:0) 8-bit mode (31:0) 32-bit mode receive pos data bus carries the frame word that is read from the framer ? s fifo. rxdata transports the frame data in an 8-bit format. rxdata[7:0] and [31:0] are updated on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v clk pos clock provides timing for the pos framer interface. clk must cycle at a 100 mhz or lower instantaneous rate. 5.0 v-tolerant 3.3 v lvttl 3.3 v rxenb receive read enable controls read access from the framer ? s receive interface. the framer ? s addressed fifo is selected on the falling edge of rxenb. generated on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v rxeof receive end-of-frame marks the last word of a frame in rxdata. updated on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v rxerr receive packet error indicates that the received packet contains an error and must be dis- carded. only asserted on the last word of a packet (when rxeof is also asserted). updated on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v rxval receive valid data output indicates the receive signals rxdata, rxeof, rxerr, and rx_lbyte are valid from the framer. updated on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v rx_lbyte(1:0) receive padding length indicates the number of padding bytes included in the last word of the packet transferred in rxdata. only used when the network processor is configured in 32-bit pos mode. updated on the rising edge of clk. rx_lbyte(1:0) (32-bit mode) 00 packet ends on rxdata(7:0) (rxdata = dddd) 01 packet ends on rxdata(15:8) (rxdata = dddp) 10 packet ends on rxdata(23:16) (rxdata = ddpp) 11 packet ends on rxdata(31:24) (rxdata = dppp) 5.0 v-tolerant 3.3 v lvttl 3.3 v rxpfa receive polled frame-available input indicates that the framers polled receive fifo con- tains data. updated on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v txdata (7:0) 8-bit mode (31:0) 32-bit mode transmit utopia data bus carries the frame word that is written to the framer ? s transmit fifo. considered valid and written to a framer ? s transmit fifo only when the transmit interface is selected by using txenb . sampled on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v txen transmit write enable controls write access to the transmit interface. a framer port is selected on the falling edge of txenb . sampled on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v txaddr(1:0) transmit address bus uses txenb to select a particular fifo within the framer for a data transfer. sampled on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v txsof transmit start-of-frame marks the first word of a frame in txdata. sampled on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v
ibm powernp np4gs3 network processor preliminary physical description page 68 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 txeof transmit end-of-frame marks the last word of a frame in txdata. sampled on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v txpadl (1:0) transmit padding length indicates the number of padding bytes included in the last word of the packet transferred in txdata. sampled on the rising edge of clk. when configured in 32-bit mode the last word may contain zero, one, two or three padding bytes and only txpadl[1:0] is used. in 8-bit mode txpadl[1:0] is not used. txpadl[1:0] (32-bit mode) 00 packet ends on txdata[7:0] (txdata = dddd) 01 packet ends on txdata[15:8] (txdata = dddp) 10 packet ends on txdata[23:16] (txdata = ddpp) 11 packet ends on txdata[31:24] (txdata = dppp) 5.0 v-tolerant 3.3 v lvttl 3.3 v txpfa transmit polled frame-available output indicates that the polled framer ? s transmit fifo has free available space and the np4gs3 can write data into the framer ? s fifo. updated on the rising edge of clk. 5.0 v-tolerant 3.3 v lvttl 3.3 v table 2-24. pos signals (page 2 of 2) signal description type
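the rx_lbyte(1:0) and txpadl(1:0) fields described in table 2-24 both encode the number of padding bytes carried in the last 32-bit word of a packet. the short c sketch below is only an illustration of that encoding (the function names and the use of c are assumptions, not part of this specification); it converts between the two-bit code and the count of valid data bytes, following the 32-bit-mode encoding quoted in the table.

#include <stdint.h>
#include <stdio.h>

/*
 * illustrative helper only -- names are not from the np4gs3 datasheet.
 * in 32-bit pos mode, rx_lbyte(1:0) / txpadl(1:0) give the number of
 * padding bytes in the last word of a packet (table 2-24):
 *   00 -> packet ends on bits  7:0  (dddd: 4 data bytes, 0 pad)
 *   01 -> packet ends on bits 15:8  (dddp: 3 data bytes, 1 pad)
 *   10 -> packet ends on bits 23:16 (ddpp: 2 data bytes, 2 pad)
 *   11 -> packet ends on bits 31:24 (dppp: 1 data byte,  3 pad)
 */
static unsigned pos32_data_bytes_in_last_word(uint8_t padl_code)
{
    return 4u - (padl_code & 0x3u);      /* code value equals the pad-byte count */
}

static uint8_t pos32_padl_code_for_packet(unsigned packet_bytes)
{
    unsigned tail = packet_bytes % 4u;   /* data bytes in the last 32-bit word */
    return (uint8_t)((tail == 0u) ? 0u : (4u - tail));
}

int main(void)
{
    /* a 9-byte packet occupies three words; the last carries 1 data byte, 3 pad */
    uint8_t code = pos32_padl_code_for_packet(9u);
    printf("txpadl code %u -> %u data byte(s) in last word\n",
           code, pos32_data_bytes_in_last_word(code));
    return 0;
}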
figure 2-11. pos transmit timing diagram
(timing-diagram figure: shows clk (t ck, t ch, t cl) and the output delays t dd, t dxa, t dra, t dsof, t deof, t den, t dpadl, and t dren for txdata, txaddr, rxaddr, txsof, txeof, txen, txpadl, and rxenb. test condition: outputs loaded with 50 ohms to v dd /2 and 30 pf.)
notes:
1) v dd = 3.3 v
2) data invalid
3) output load 50 ohms and 30 pf
figure 2-12. pos receive timing diagram
(timing-diagram figure: shows clk (t ck, t ch, t cl) and the input setup and hold times t rxs /t rxh, t rvs /t rvh, t reofs /t reofh, t rpfs /t rpfh, t tpfs /t tpfh, and t rpads /t rpadh for rxdata, rxval, rxerr, rxeof, rxpfa, txpfa, and rxpadl. test condition: outputs loaded with 50 ohms to v dd /2 and 30 pf.)
notes:
1) v dd = 3.3 v
2) data invalid
3) output load 50 ohms and 30 pf
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 71 of 554 table 2-25. pos timing diagram legend (for figure 2-11 and figure 2-12) np4gs3b (r2.0) symbol symbol description minimum (ns) maximum (ns) t ck clk cycle time 10 t ch clk clock width high 2.1 t cl clk clock width low 2.1 t dd tx_data_(31:0) output delay 2.2 4.6 t dxa tx_addr_(1:0) output delay 2.2 4.6 t dra rx_addr_(1:0) output delay 2.1 4.3 t dsof txsof output delay 2.3 4.6 t deof txeof output delay 2.2 4.5 t den txen output delay 1.9 4.7 t dpadl txpadl_(1:0) output delay 2.2 4.6 t dren rxenb output delay 1.5 4.3 t rxs rx_data_(31:0) setup time 1.9 t rxh rx_data_(31:0) hold time 0 t rvs rxval setup time 1.9 t rvh rxval hold time 0 t rers rxerr setup time 1.8 t rerh rxerr hold time 0 t reofs rxeof setup time 1.9 t reofh rxeof hold time 0 t rpfs rxpfa setup time 1.9 t rpfh rxpfa hold time 0 t tpfs txpfa setup time 1.6 t tpfh txpfa hold time 0 t rpads rxpadl setup time 1.8 t rpadh rxpadl hold time 0 1. all delays are measured with 1 ns slew time measured from 10-90% of input voltage.
ibm powernp np4gs3 network processor preliminary physical description page 72 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.1.6 pci pins these pins interface with the pci bus. table 2-26. pci pins (page 1 of 2) signal description type pci_clk pci clock signal (see pci_speed field below) input pci (in)/ 3.3 v pci_ad(31:0) pci multiplexed address and data signals input/output pci (t/s) 3.3 v pci_cbe (3:0) pci command/byte enable signals input/output pci (t/s) 3.3 v pci_frame pci frame signal input/output pci (s/t/s) 3.3 v pci_irdy pci initiator (master) ready signal input/output pci (s/t/s) 3.3 v pci_trdy pci target (slave) ready signal input/output pci (s/t/s) 3.3 v pci_devsel pci device select signal input/output pci (s/t/s) 3.3 v pci_stop pci stop signal input/output pci (s/t/s) 3.3 v pci_request pci bus request signal output pci (t/s) 3.3 v pci_grant pci bus grant signal input pci (t/s) 3.3 v pci_idsel pci initialization device select signal input pci (in) 3.3 v pci_perr pci parity error signal input/output pci (s/t/s) 3.3 v pci_serr pci system error signal input/output pci (o/d) 3.3 v pci_inta pci level sensitive interrupt output pci (o/d) 3.3 v pci_par pci parity signal. covers all the data/address and the four command/be signals. input/output pci (t/s) 3.3 v note: pci i/os are all configured for multi-point operation.
table 2-26. pci pins (page 2 of 2)

pci_speed       pci speed. controls the acceptable pci frequency asynchronous range by setting the plb:pci clock ratio.
                0  plb:pci mode 2:1; acceptable pci frequency asynchronous range is 34.5 mhz to 66.6 mhz
                1  plb:pci mode 3:1; acceptable pci frequency asynchronous range is 23.5 mhz to 44.5 mhz
                type: input, 3.3 v-tolerant 2.5 v
pci_bus_nm_int  external non-maskable interrupt. the active polarity of the interrupt is programmable by the powerpc.
                type: input, pci 3.3 v
pci_bus_m_int   external maskable interrupt. the active polarity of the interrupt is programmable by the powerpc.
                type: input, pci 3.3 v
note: pci i/os are all configured for multi-point operation.
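the pci_speed pin trades the plb:pci clock ratio against the pci clock range the interface accepts: the 2:1 setting covers 34.5 to 66.6 mhz and the 3:1 setting covers 23.5 to 44.5 mhz. as an illustration only (the function below and its preference for 2:1 in the overlapping range are assumptions, not part of this specification), board bring-up or design-check code might select the strap value from the intended bus frequency as follows.

#include <stdio.h>

/*
 * illustrative only -- chooses the pci_speed strap value from the pci bus
 * clock frequency, using the ranges quoted in table 2-26:
 *   pci_speed = 0 : plb:pci = 2:1, pci clock 34.5 mhz .. 66.6 mhz
 *   pci_speed = 1 : plb:pci = 3:1, pci clock 23.5 mhz .. 44.5 mhz
 * returns 0 or 1, or -1 if the frequency is outside both ranges.
 */
static int pick_pci_speed(double pci_mhz)
{
    if (pci_mhz >= 34.5 && pci_mhz <= 66.6)
        return 0;                      /* 2:1 mode covers it */
    if (pci_mhz >= 23.5 && pci_mhz <= 44.5)
        return 1;                      /* only 3:1 mode covers it */
    return -1;                         /* unsupported pci clock */
}

int main(void)
{
    double f[] = { 25.0, 33.3, 40.0, 66.0, 70.0 };
    for (unsigned i = 0; i < sizeof f / sizeof f[0]; i++)
        printf("%5.1f mhz -> pci_speed %d\n", f[i], pick_pci_speed(f[i]));
    return 0;
}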
figure 2-13. pci timing diagram
(timing-diagram figure: shows pci_clk (t ck, t ch, t cl), the output valid delay t val, the bus turn-on and turn-off delays t on and t off, and the input setup and hold times t ds and t dh. test condition: outputs loaded with 50 ohms to v dd /2 and 30 pf.)
notes:
1) v dd = 3.3 v
2) data invalid
3) output load 50 ohms and 30 pf
table 2-27. pci timing diagram legend (for figure 2-13)
symbol    symbol description              minimum (ns)   maximum (ns)
t ck      pci cycle time                  15
t ch      clk clock width high            7.5
t cl      clk clock width low             7.5
t val     worst case output delay         2.0            4.8
t on      pci bus turn on output delay    2
t off     pci bus turn off output delay                  14
t ds      input setup time                2.4
t dh      input hold time                 0
note: all delays are measured with 1 ns slew time measured from 10-90% of input voltage.

2.1.7 management bus interface pins

the signals from these pins are translated into various "host" buses by an external field-programmable gate array (fpga) serial/parallel manager (spm).

table 2-28. management bus pins
signal     description                                           type
mg_data    serial data. supports address/control/data protocol.  input/output, 3.3 v-tolerant 2.5 v
mg_clk     33.33 mhz clock                                       output, 3.3 v-tolerant 2.5 v
mg_nintr   rising-edge sensitive interrupt input                 input, 3.3 v-tolerant 2.5 v
figure 2-14. spm bus timing diagram
(timing-diagram figure: shows mg_clk (t ck, t ch, t cl), the mg_data output delay t dd (min/max), and the mg_data input setup and hold times t ds and t dh. test condition: outputs loaded with 50 ohms to v dd /2 and 30 pf.)
notes:
1) v dd = 3.3 v
2) data invalid
3) output load 50 ohms and 30 pf

table 2-29. spm bus timing diagram legend (for figure 2-14)
symbol    symbol description        minimum (ns)   maximum (ns)
t ck      spm cycle time            30
t ch      clock pulse width high    14.4           15.6
t cl      clock pulse width low     14.4           15.6
t dd      data output delay         6.9            7.9
t ds      data setup time           5.1
t dh      data hold time            0
note: mg_nintr is an asynchronous input and is not timed. all delays are measured with 1 ns slew time measured from 10-90% of input voltage.
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 77 of 554 2.1.8 miscellaneous pins table 2-30. miscellaneous pins (page1of2) signal description type switch_clock_a the positive pin of an input differential pair. 50.6875 to 62.5 mhz. generates packet routing switch clock domains. required to have cycle-to-cycle jitter  150 ps. duty cycle tolerance must be  10%. an on-chip differential terminator of 100 ohms is present between this pin and its complement pin. input lvds 1.5 v switch_clock_a the negative pin of an input differential pair. 50.6875 to 62.5 mhz. input lvds 1.5 v switch_clock_b the positive pin of an input differential pair. 50.6875 to 62.5 mhz. generates packet routing switch clock domains. required to have cycle-to-cycle jitter  150 ps. duty cycle tolerance must be  10%. an on-chip differential terminator of 100 ohms is present between this pin and its complement pin. input lvds 1.5 v switch_clock_b the negative pin of an input differential pair. 50.6875 to 62.5 mhz. input lvds 1.5 v switch_bna selects which of the two dasl ports (a or b) carries network traffic. 0 port a carries the network traffic 1 port b carries the network traffic input 5.0 v-tolerant 3.3 v lvttl 3.3 v core_clock 53.33 mhz oscillator - generates 266 /133 clock domains. required to have cycle-to-cycle jitter  150 ps. duty cycle tolerance must be  5%. input 5.0 v-tolerant 3.3 v lvttl 3.3 v clock125 125 mhz oscillator. required to have cycle-to-cycle jitter  60 ps. duty cycle tolerance must be  5%. this clock is required only when supporting tbi and gmii dmu bus modes. input 5.0 v-tolerant 3.3 v lvttl 3.3 v blade_reset reset np4gs3 - signal must be driven active low for a minimum of 1  s to ensure a proper reset of the np4gs3. all input clocks (switch_clock_a, switch_clock_a , switch_clock_b, switch_clock_b , core_clock, clock125 if in use, and pci_clk) must be running prior to the activation of this signal. input 5.0 v-tolerant 3.3 v lvttl 3.3 v operational np4gs3 operational - pin is driven active low when both the np4gs3 ingress and egress macros have completed their initialization. it remains active until a subsequent blade_reset is issued. output 5.0 v-tolerant 3.3 v lvttl 3.3 v testmode(1:0) 00 functional mode, including concurrent use of the jtag interface for riscwatch or cabwatch operations. 01 debug mode - debug mode must be indicated by the testmode i/o for debug bus (dmu_d) output to be gated from the probe. 10 jtag test mode 11 lssd test mode input cmos 1.8 v jtag_trst jtag test reset. for normal functional operation, this pin must be connected to the same card source that is connected to the blade_reset input. when the jtag interface is used for jtag test functions, this pin is controlled by the jtag interface logic on the card. input 5.0 v-tolerant 3.3 v lvttl 3.3 v jtag_tms jtag test mode select. for normal functional operation, this pin should be tied either low or high. input 5.0 v-tolerant 3.3 v lvttl 3.3 v jtag_tdo jtag test data out. for normal functional operation, this pin should be tied either low or high. output 5.0 v-tolerant 3.3 v lvttl 3.3 v
ibm powernp np4gs3 network processor preliminary physical description page 78 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 jtag_tdi jtag test data in. for normal functional operation, this pin should be tied either low or high. input 5.0 v-tolerant 3.3 v lvttl 3.3 v jtag_tck jtag test clock. for normal functional operation, this pin should be tied either low or high. input 5.0 v-tolerant 3.3 v lvttl 3.3 v plla_v dd pllb_v dd pllc_v dd these pins serve as the +1.8 volt supply for a critical noise-sensitive portion of the phase- locked loop (pll) circuits. one pin serves as the analog v dd for each pll circuit. to prevent noise on these pins from introducing phase jitter in the pll outputs, place filters at the board level to isolate these pins from the noisy digital v dd pins. place separate filters on each ana- log v dd pin to prevent noise from one pll being introduced into another. see section 2.1.9 pll filter circuit on page 80 for filter circuit details. input pll_v dd 1.8 v plla_gnd pllb_gnd pllc_gnd these pins serve as the ground connection for the critical noise portion of the phase lock loop (pll). one pin serves as the analog gnd for each pll circuit. each should be con- nected to the digital ground plane at the v dda node of the pll filter capacitor shown in fig- ure 2-15: pll filter circuit diagram on page 80. input pll_gnd 0.0 v thermal_in input pad of the thermal monitor (resistor). see 2.1.10 thermal i/o usage on page 80 for details on thermal monitor usage thermal thermal_out output pad of the thermal monitor (resistor) thermal vref1(2), vref2(8,7,6) voltage reference for sstl2 i/os for d1, d2, and d3 (approximately four pins per side of the device that contains sstl2 i/o) input vref 1.25 v vref1(1), vref2(5,4,3) voltage reference for sstl2 i/os for d4 and d6 (approximately four pins per side of the device that contains sstl2 i/o) input vref 1.25 v vref1(0), vref2(2,1,0) voltage reference for sstl2 i/os for ds0 and ds1 (approximately four pins per side of the device that contains sstl2 i/o) input vref 1.25 v boot_picocode determines location of network processor picocode load location. 0 load from spm 1 load from external source (typically power pc or pci bus) input 3.3 v-tolerant 2.5 v 2.5 v boot_ppc determines location of power pc code start location. 0startfromd6 1startfrompci input 3.3 v-tolerant 2.5 v 2.5 v spare_tst_rcvr(9:0) unused signals needed for manufacturing test. spare_tst_rcvr (9:5,1) should be tied to 0 on the card. spare_tst_rcvr (4:2,0) should be tied to 1 on the card. input cmos 1.8 v c405_debug_halt this signal, when asserted low, forces the embedded powerpc 405 processor to stop pro- cessing all instructions. for normal functional operation, this signal should be tied inactive high. input 5.0 v-tolerant 3.3 v lvttl 3.3 v pio(2:0) programmable i/o [np4gs3b (r2.0))] input/output cmos 2.5v table 2-30. miscellaneous pins (page2of2) signal description type
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 79 of 554 table 2-31. signals requiring pull-up or pull-down signal name function value signals requiring a dc connection that is the same value for all applications testmode(1:0) pull-down jtag_tdi pull-up jtag_tms pull-up jtag_tck pull-up c405_debug_halt pull-up spare_tst_rcvr (9:5, 1) pull-down spare_tst_rcvr (4:2, 0) pull-up signals requiring a dc connection that varies across different applications multicast_grant_a, multicast_grant_b pull-up if no system device drives this signal res_data pull-down if no other system device drives this signal pci_speed choose up or down based on system pci bus speed mg_nintr pull-down if no system device drives this signal mg_data pull-down when external spm module is attached boot_picocode choose up or down based on picocode load location boot_ppc choose up or down based on ppc code load location switch_bna pull-up if no system device drives this signal signals which have an ac connection, but also require pull-up or pull-down operational pull-up dmu_a(0), dmu_b(0), dmu_c(0), dmu_d(0) cpdetect if control point blade then pull-down, otherwise pull-up dmu_a(30:29), dmu_b(30:29), dmu_c(30:29), dmu_d(30:29) dmu in 8-bit pos mode. rxaddr (1:0) pull-down dmu_a(4), dmu_b(4), dmu_c(4), dmu_d(4) dmu in any pos mode. rxval pull-down d3_dqs(1:0), d2_dqs(1:0), d1_dqs(1:0) pull-down d0_dqs(3:0), d4_dqs(3:0), d6_dqs(3:0) pull-down ds0_dqs(3:0), ds1_dqs(3:0) pull-down
2.1.9 pll filter circuit

v dda is the voltage supply pin for the analog circuits in the pll. noise on v dda causes phase jitter at the output of the pll. v dda is brought to a package pin to isolate it from the noisy internal digital v dd signal. if little noise is expected at the board level, v dda can be connected directly to the digital v dd plane. in most circumstances, however, it is prudent to place a filter circuit on v dda as shown below.

note: all wire lengths should be kept as short as possible to minimize coupling from other signals. the impedance of the ferrite bead should be much greater than that of the capacitor at the frequencies where noise is expected. many applications have found that a resistor reduces jitter better than a ferrite bead does; the resistor should be kept to a value lower than 2 ohms. experimentation is the best way to determine the optimal filter design for a specific application.

note: one filter circuit may be used for plla and pllb, and a second filter circuit should be used for pllc.

figure 2-15. pll filter circuit diagram
(schematic figure: the digital v dd plane (via at board) feeds v dda (to pll) through a ferrite bead, with c = 0.1 µf from v dda to gnd.)

2.1.10 thermal i/o usage

the thermal monitor consists of a resistor connected between pins pada and padb. at 25°c this resistance is estimated at 1290 ± 350 ohms. the published temperature coefficient of the resistance for this technology is 0.33% per °c. to determine the actual temperature coefficient, see measurement calibration on page 81.

note: there is an electrostatic discharge (esd) diode at pada and padb.

figure 2-16. thermal monitor
(schematic figure: the thermal-monitor resistor connected between pins pada and padb.)
2.1.10.1 temperature calculation

the chip temperature can be calculated from:

    t chip = ((r measured - r calibrated) / tc) + t calibrated    (°c)

where:
r measured = resistance measured between pada and padb at the test temperature.
r calibrated = resistance measured between pada and padb (v r / i dc) at a known temperature.
t calibrated = known temperature used to measure r calibrated.
tc = temperature coefficient of the thermal-monitor resistance (see measurement calibration below).

2.1.10.2 measurement calibration

to use this thermal monitor accurately, it must first be calibrated. to calibrate, measure the voltage drop at two different known temperatures at the package while the device is dissipating little (less than 100 mw) or no power. apply i dc and wait for a fixed time t m, where t m = approximately 1 ms. keep t m short to minimize heating effects on the thermal-monitor resistance. then measure v r. next, turn off i dc and change the package temperature. reapply i dc, wait t m again, and measure v r. the temperature coefficient is:

    tc = Δ(v r / i dc) / Δt    (ohms/°c)

where:
Δt = temperature change, °c
v r = voltage drop, v
i dc = applied current, a

(measurement setup figure: measure the voltage drop v r across the thermal-monitor resistor between pada and padb; v supply = v dd maximum, i dc = 200 µa maximum.)
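as an illustration of the calculation above, the c sketch below derives tc from two calibration points as the change in v r /i dc per degree and then applies the chip-temperature formula. the function names and the example calibration points (25°c and 85°c with the nominal 1290 ohm, 0.33%/°c characteristic) are assumptions for illustration only, not values from this specification.

#include <stdio.h>

/*
 * illustrative only -- thermal-monitor arithmetic from section 2.1.10.
 * tc is derived from two calibration points as the change in resistance
 * (v_r / i_dc) per degree c; the chip temperature then follows from a
 * third resistance measurement.
 */
static double thermal_tc(double r1_ohm, double t1_c, double r2_ohm, double t2_c)
{
    return (r2_ohm - r1_ohm) / (t2_c - t1_c);      /* ohms per degree c */
}

static double thermal_chip_temp(double r_measured_ohm,
                                double r_calibrated_ohm,
                                double t_calibrated_c,
                                double tc_ohm_per_c)
{
    return (r_measured_ohm - r_calibrated_ohm) / tc_ohm_per_c + t_calibrated_c;
}

int main(void)
{
    /* assumed example: nominal 1290 ohm at 25 c with a 0.33%/c slope */
    double r25 = 1290.0;
    double r85 = 1290.0 * (1.0 + 0.0033 * (85.0 - 25.0));
    double tc  = thermal_tc(r25, 25.0, r85, 85.0);

    printf("tc = %.2f ohm/c, 1350 ohm -> %.1f c\n",
           tc, thermal_chip_temp(1350.0, r25, 25.0, tc));
    return 0;
}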
2.2 clocking domains

see np4gs3 dmu bus clock connections on page 56 and np4gs3 dmu bus clock connections (pos overview) on page 57 for related clock information.

figure 2-17. clock generation and distribution
(block-diagram figure: shows the on-chip plls deriving the core, ddr, and zbt clock domains from the 53.3 mhz core_clock, the dasl-a and dasl-b packet routing switch clock domains from switch_clk_a and switch_clk_b (55 mhz shown) supplied by the switch fabric cards, and the 125 mhz clock125 and dmu clocks used for the smii, tbi, and gmii connections to phy and pos devices.)
note: switch_clk_a and b frequencies are shown for illustration purposes. these clocks can range from 50.6875 to 62.5 mhz.
figure 2-18. pins diagram
(package-footprint figure, viewed through the top of the package, showing the 33 x 33 ball grid with terminal a01 at the corner. legend: signal = 815, ground = 137, v dd = 72, v dd2 (3.3 v) = 16, v dd3 (2.5 v) = 16, v dd4 (2.5 v) = 16, v dd5 (2.5 v) = 16, plus dc test i/o.)
note: for illustrative purposes only. viewed through top of package.
2.3 mechanical specifications

figure 2-19. mechanical diagram
(package outline drawing, top and bottom views, showing dimensions a, a1, b, d, d1, e, e1, and e and tolerances aaa, ccc, ddd, and eee, with the terminal a01 identifier; dla and lidless dla chip variants are shown. there is no i/o at the a01 location.)
notes:
1. refer to table 2-32 on page 85 for mechanical dimensions.
2. mechanical drawing is not to scale. see your ibm representative for more information.
3. ibm square outline conforms to jedec mo-158.
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 85 of 554 table 2-32. mechanical specifications mechanical dimensions value 1 a(dla) min 2 6.23 max 3 6.83 a (lidless) min 2 4.23 max 3 4.83 a1 nom 2.21 b min 0.48 max 0.52 e basic 1.27 aaa 0.15 ccc 0.20 ddd 0.30 eee 0.10 d 42.50 d1 40.64 e 42.50 e1 40.64 m 4 33 x 33 n 5 1088 5 weight (g) tbd 1. all dimensions are in millimeters, except where noted. 2. minimum package thickness is calculated using the nominal thickness of all parts. the nominal thickness of an 8-layer package was used for the package thickness. 3. maximum package thickness is calculated using the nominal thickness of all parts. the nominal thickness of a 12-layer package was used for the package thickness. 4. m = the i/o matrix size. 5. n = the maximum number of i/os. the number of i/os shown in the table is the amount after depopulation. product with 1.27 mm pitch is depopulated by one i/o at the a01 corner of the array. 6. ibm square outline conforms to jedec mo-158.
ibm powernp np4gs3 network processor preliminary physical description page 86 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 2.4 signal pin lists note: all unused pins should be left unconnected on the card. table 2-33. complete signal pin listing by signal name (page 1 of 10) signal name grid position signal name grid position signal name grid position blade_reset e29 d0_data(16) aj07 d1_data(01) an11 boot_picocode k07 d0_data(17) an04 d1_data(02) al12 boot_ppc l04 d0_data(18) an07 d1_data(03) aj10 c405_debug_halt ab23 d0_data(19) al07 d1_data(04) ak09 clock125 b33 d0_data(20) af09 d1_data(05) y15 core_clock c33 d0_data(21) ag08 d1_data(06) aa15 d0_addr(00) al10 d0_data(22) ae10 d1_data(07) aj11 d0_addr(01) an10 d0_data(23) ad11 d1_data(08) ak11 d0_addr(02) aj09 d0_data(24) an08 d1_data(09) al13 d0_addr(03) an06 d0_data(25) al09 d1_data(10) an12 d0_addr(04) aa14 d0_data(26) al06 d1_data(11) ah11 d0_addr(05) ab15 d0_data(27) am05 d1_data(12) am13 d0_addr(06) am07 d0_data(28) ab13 d1_data(13) an13 d0_addr(07) al08 d0_data(29) ac15 d1_data(14) ah13 d0_addr(08) am11 d0_data(30) ak07 d1_data(15) ac16 d0_addr(09) al11 d0_data(31) aj08 d1_dqs(0) aj14 d0_addr(10) ac14 d0_dqs(0) ag09 d1_dqs(1) an16 d0_addr(11) ag10 d0_dqs(1) ac12 d1_we al16 d0_addr(12) af11 d0_dqs(2) ae11 d2_addr(00) aj22 d0_cs an09 d0_dqs(3) ah09 d2_addr(01) ah21 d0_data(00) aa12 d0_we am09 d2_addr(02) an21 d0_data(01) ad09 d1_addr(00) ab17 d2_addr(03) am21 d0_data(02) ag07 d1_addr(01) aj13 d2_addr(04) ah23 d0_data(03) ab11 d1_addr(02) ak13 d2_addr(05) ac20 d0_data(04) an02 d1_addr(03) am15 d2_addr(06) ad21 d0_data(05) am03 d1_addr(04) an14 d2_addr(07) ag23 d0_data(06) an01 d1_addr(05) ag12 d2_addr(08) an22 d0_data(07) an03 d1_addr(06) af13 d2_addr(09) al21 d0_data(08) al04 d1_addr(07) ae14 d2_addr(10) ak23 d0_data(09) ag03 d1_addr(08) an15 d2_addr(11) aj23 d0_data(10) ah07 d1_addr(09) al14 d2_addr(12) aa19 d0_data(11) ae09 d1_addr(10) w16 d2_cs al22 d0_data(12) al03 d1_addr(11) ah15 d2_data(00) an18 d0_data(13) ac11 d1_addr(12) aj15 d2_data(01) aj19 d0_data(14) aj06 d1_cs ad15 d2_data(02) w18 d0_data(15) ak05 d1_data(00) ae12 d2_data(03) aa18
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 87 of 554 d2_data(04) aj20 d3_data(10) ag16 d4_data(16) e14 d2_data(05) al20 d3_data(11) ah17 d4_data(17) c14 d2_data(06) an19 d3_data(12) ag17 d4_data(18) e09 d2_data(07) ae20 d3_data(13) ag15 d4_data(19) a15 d2_data(08) ae17 d3_data(14) af15 d4_data(20) l15 d2_data(09) ae21 d3_data(15) af19 d4_data(21) j14 d2_data(10) ag22 d3_dqs(0) ag20 d4_data(22) g12 d2_data(11) an20 d3_dqs(1) ac17 d4_data(23) h13 d2_data(12) am19 d3_we ad19 d4_data(24) a14 d2_data(13) ak21 d4_addr(00) c12 d4_data(25) b15 d2_data(14) aj21 d4_addr(01) a11 d4_data(26) d13 d2_data(15) y19 d4_addr(02) j12 d4_data(27) e13 d2_dqs(0) ab19 d4_addr(03) h11 d4_data(28) p15 d2_dqs(1) ak25 d4_addr(04) g10 d4_data(29) e12 d2_we aj24 d4_addr(05) l13 d4_data(30) f13 d3_addr(00) ag19 d4_addr(06) c11 d4_data(31) a13 d3_addr(01) aj16 d4_addr(07) b11 d4_dqs(0) d11 d3_addr(02) af17 d4_addr(08) d09 d4_dqs(1) e11 d3_addr(03) aj17 d4_addr(09) c08 d4_dqs(2) n15 d3_addr(04) ag18 d4_addr(10) m13 d4_dqs(3) m15 d3_addr(05) ae18 d4_addr(11) n14 d4_we f11 d3_addr(06) aa17 d4_addr(12) b07 d6_addr(00) al28 d3_addr(07) ac18 d4_cs e10 d6_addr(01) an26 d3_addr(08) ak19 d4_data(00) m17 d6_addr(02) ae24 d3_addr(09) al19 d4_data(01) l16 d6_addr(03) ag26 d3_addr(10) an17 d4_data(02) d15 d6_addr(04) af25 d3_addr(11) ak17 d4_data(03) c15 d6_addr(05) al27 d3_addr(12) ae19 d4_data(04) a17 d6_addr(06) an30 d3_cs al18 d4_data(05) d17 d6_addr(07) aj27 d3_data(00) ag13 d4_data(06) h09 d6_addr(08) ak29 d3_data(01) aa16 d4_data(07) g14 d6_addr(09) aj28 d3_data(02) ag14 d4_data(08) g13 d6_addr(10) al29 d3_data(03) ae15 d4_data(09) k15 d6_addr(11) ag21 d3_data(04) al17 d4_data(10) c16 d6_addr(12) ah27 d3_data(05) am17 d4_data(11) a16 d6_byteen(0) ag24 d3_data(06) al15 d4_data(12) e15 d6_byteen(1) af23 d3_data(07) ak15 d4_data(13) f15 d6_cs al31 d3_data(08) w17 d4_data(14) r17 d6_data(00) al23 d3_data(09) ae16 d4_data(15) n16 d6_data(01) am23 table 2-33. complete signal pin listing by signal name (page 2 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 network processor preliminary physical description page 88 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 d6_data(02) al26 dasl_in_b(0) e02 dasl_out_b(3) n03 d6_data(03) am27 dasl_in_b(0) d01 dasl_out_b(4) l05 d6_data(04) ab21 dasl_in_b(1) j02 dasl_out_b(4) l06 d6_data(05) an28 dasl_in_b(1) h01 dasl_out_b(5) k05 d6_data(06) an24 dasl_in_b(2) f03 dasl_out_b(5) l07 d6_data(07) al24 dasl_in_b(2) f01 dasl_out_b(6) j05 d6_data(08) ah25 dasl_in_b(3) g02 dasl_out_b(6) j04 d6_data(09) ae23 dasl_in_b(3) g04 dasl_out_b(7) h05 d6_data(10) ac21 dasl_in_b(4) j03 dasl_out_b(7) h03 d6_data(11) an25 dasl_in_b(4) j01 da_ba(0) an29 d6_data(12) aj26 dasl_in_b(5) k01 da_ba(1) aa20 d6_data(13) ak27 dasl_in_b(5) k03 da_cas an27 d6_data(14) ae22 dasl_in_b(6) l03 da_clk am25 d6_data(15) ac22 dasl_in_b(6) l02 da_clk al25 d6_dqs(0) al30 dasl_in_b(7) n04 da_ras ag25 d6_dqs(1) an31 dasl_in_b(7) m03 db_ba(0) ag11 d6_dqs(2) an33 dasl_out_a(0) y01 db_ba(1) ad13 d6_dqs(3) am31 dasl_out_a(0) w02 db_cas aj12 d6_dqs_par(00) an32 dasl_out_a(1) w03 db_clk ah19 d6_dqs_par(01) af21 dasl_out_a(1) aa01 db_clk aj18 d6_parity(00) am29 dasl_out_a(2) ab01 db_ras ae13 d6_parity(01) an23 dasl_out_a(2) aa02 dc_ba(0) d25 d6_we aj25 dasl_out_a(3) ac06 dc_ba(1) j17 dasl_in_a(0) ak01 dasl_out_a(3) ac05 dc_cas n19 dasl_in_a(0) aj02 dasl_out_a(4) ac07 dc_clk e23 dasl_in_a(1) af01 dasl_out_a(4) ad05 dc_clk d23 dasl_in_a(1) ae02 dasl_out_a(5) ae04 dc_ras c21 dasl_in_a(2) ah01 dasl_out_a(5) ae05 dd_ba(0) a12 dasl_in_a(2) ah03 dasl_out_a(6) af03 dd_ba(1) j13 dasl_in_a(3) ae01 dasl_out_a(6) af05 dd_cas g11 dasl_in_a(3) ae03 dasl_out_a(7) ag04 dd_clk c13 dasl_in_a(4) ad03 dasl_out_a(7) ag02 dd_clk b13 dasl_in_a(4) ad01 dasl_out_b(0) r02 de_ba(0) am01 dasl_in_a(5) ac02 dasl_out_b(0) p01 de_ba(1) ah05 dasl_in_a(5) ac03 dasl_out_b(1) n01 de_cas al02 dasl_in_a(6) ab03 dasl_out_b(1) r03 de_clk aj04 dasl_in_a(6) aa04 dasl_out_b(2) n02 de_clk aj05 dasl_in_a(7) aa03 dasl_out_b(2) m01 dd_ras k13 dasl_in_a(7) y03 dasl_out_b(3) p03 de_ras ak03 table 2-33. complete signal pin listing by signal name (page 3 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 89 of 554 dmu_a(00) v31 dmu_b(08) r28 dmu_c(16) m23 dmu_a(01) v29 dmu_b(09) r27 dmu_c(17) n33 dmu_a(02) v27 dmu_b(10) r26 dmu_c(18) n32 dmu_a(03) w33 dmu_b(11) r25 dmu_c(19) n31 dmu_a(04) w32 dmu_b(12) r24 dmu_c(20) n30 dmu_a(05) w31 dmu_b(13) r23 dmu_c(21) n29 dmu_a(06) w30 dmu_b(14) t33 dmu_c(22) n28 dmu_a(07) w29 dmu_b(15) t31 dmu_c(23) n27 dmu_a(08) w28 dmu_b(16) t29 dmu_c(24) n26 dmu_a(09) w27 dmu_b(17) t27 dmu_c(25) n25 dmu_a(10) w26 dmu_b(18) r22 dmu_c(26) n24 dmu_a(11) w25 dmu_b(19) t23 dmu_c(27) n23 dmu_a(12) w24 dmu_b(20) v25 dmu_c(28) p33 dmu_a(13) w23 dmu_b(21) u32 dmu_c(29) p31 dmu_a(14) y33 dmu_b(22) u31 dmu_c(30) p29 dmu_a(15) y31 dmu_b(23) u30 dmu_d(00) d33 dmu_a(16) y29 dmu_b(24) u29 dmu_d(01) d31 dmu_a(17) y27 dmu_b(25) u28 dmu_d(02) g28 dmu_a(18) y25 dmu_b(26) u27 dmu_d(03) j29 dmu_a(19) y23 dmu_b(27) u26 dmu_d(04) e30 dmu_a(20) aa33 dmu_b(28) u25 dmu_d(05) f33 dmu_a(21) aa32 dmu_b(29) u24 dmu_d(06) f31 dmu_a(22) aa31 dmu_b(30) v33 dmu_d(07) f29 dmu_a(23) aa30 dmu_c(00) l33 dmu_d(08) g32 dmu_a(24) aa29 dmu_c(01) l32 dmu_d(09) k25 dmu_a(25) aa28 dmu_c(02) l31 dmu_d(10) g30 dmu_a(26) aa27 dmu_c(03) l30 dmu_d(11) g29 dmu_a(27) aa26 dmu_c(04) l29 dmu_d(12) e32 dmu_a(28) aa25 dmu_c(05) l28 dmu_d(13) h33 dmu_a(29) ab33 dmu_c(06) l27 dmu_d(14) h31 dmu_a(30) ab31 dmu_c(07) l26 dmu_d(15) h29 dmu_b(00) p27 dmu_c(08) l25 dmu_d(16) h27 dmu_b(01) p25 dmu_c(09) l24 dmu_d(17) j33 dmu_b(02) p23 dmu_c(10) l23 dmu_d(18) j32 dmu_b(03) r33 dmu_c(11) m33 dmu_d(19) j31 dmu_b(04) r32 dmu_c(12) m31 dmu_d(20) j30 dmu_b(05) r31 dmu_c(13) m29 dmu_d(21) k27 dmu_b(06) r30 dmu_c(14) m27 dmu_d(22) j28 dmu_b(07) r29 dmu_c(15) m25 dmu_d(23) j27 table 2-33. complete signal pin listing by signal name (page 4 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 network processor preliminary physical description page 90 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 dmu_d(24) j26 ds0_data(18) j24 ds1_data(06) a21 dmu_d(25) j25 ds0_data(19) k23 ds1_data(07) f21 dmu_d(26) k33 ds0_data(20) a26 ds1_data(08) e22 dmu_d(27) k31 ds0_data(21) c25 ds1_data(09) l18 dmu_d(28) k29 ds0_data(22) c28 ds1_data(10) l17 dmu_d(29) e31 ds0_data(23) b29 ds1_data(11) e21 dmu_d(30) g31 ds0_data(24) m21 ds1_data(12) d21 ds0_addr(00) a28 ds0_data(25) l19 ds1_data(13) b19 ds0_addr(01) n20 ds0_data(26) d27 ds1_data(14) a20 ds0_addr(02) m19 ds0_data(27) e26 ds1_data(15) g22 ds0_addr(03) b27 ds0_data(28) a25 ds1_data(16) j21 ds0_addr(04) c26 ds0_data(29) b25 ds1_data(17) j15 ds0_addr(05) b23 ds0_data(30) g25 ds1_data(18) j20 ds0_addr(06) c23 ds0_data(31) l22 ds1_data(19) a19 ds0_addr(07) g24 ds0_dqs(0) f25 ds1_data(20) e18 ds0_addr(08) h23 ds0_dqs(1) c24 ds1_data(21) c20 ds0_addr(09) j22 ds0_dqs(2) a24 ds1_data(22) e20 ds0_addr(10) a23 ds0_dqs(3) e25 ds1_data(23) n18 ds0_addr(11) c22 ds0_we j23 ds1_data(24) f19 ds0_addr(12) e24 ds1_addr(00) n17 ds1_data(25) e19 ds0_cs a31 ds1_addr(01) j18 ds1_data(26) a18 ds0_data(00) p19 ds1_addr(02) g18 ds1_data(27) c18 ds0_data(01) a32 ds1_addr(03) e17 ds1_data(28) k19 ds0_data(02) b31 ds1_addr(04) h17 ds1_data(29) g21 ds0_data(03) a33 ds1_addr(05) g19 ds1_data(30) g20 ds0_data(04) c30 ds1_addr(06) h19 ds1_data(31) j19 ds0_data(05) c31 ds1_addr(07) h15 ds1_dqs(0) b17 ds0_data(06) f27 ds1_addr(08) g15 ds1_dqs(1) c19 ds0_data(07) h21 ds1_addr(09) g17 ds1_dqs(2) d19 ds0_data(08) c29 ds1_addr(10) f17 ds1_dqs(3) r18 ds0_data(09) a29 ds1_addr(11) g16 ds1_we c17 ds0_data(10) e28 ds1_addr(12) j16 dum a01 ds0_data(11) d29 ds1_cs e16 gnd b10 ds0_data(12) e27 ds1_data(00) a22 gnd ah32 ds0_data(13) a30 ds1_data(01) g23 gnd ak04 ds0_data(14) a27 ds1_data(02) k21 gnd b24 ds0_data(15) c27 ds1_data(03) l20 gnd ah24 ds0_data(16) h25 ds1_data(04) f23 gnd ah28 ds0_data(17) g26 ds1_data(05) b21 gnd af30 table 2-33. complete signal pin listing by signal name (page 5 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 91 of 554 gnd af18 gnd am02 gnd ab04 gnd b32 gnd f06 gnd m26 gnd af12 gnd ak18 gnd k06 gnd b28 gnd am10 gnd ab26 gnd af16 gnd f20 gnd k02 gnd ah20 gnd am14 gnd t12 gnd b20 gnd ad06 gnd m22 gnd y17 gnd ak22 gnd r15 gnd w19 gnd ad24 gnd r19 gnd y20 gnd f10 gnd d04 gnd k20 gnd ad14 gnd h12 gnd f32 gnd t30 gnd af08 gnd w15 gnd f28 gnd d08 gnd y02 gnd ah06 gnd v12 gnd v30 gnd m08 gnd af04 gnd am20 gnd d12 gnd t16 gnd ah10 gnd h18 gnd v26 gnd af26 gnd p14 gnd d22 gnd af22 gnd p24 gnd y10 gnd f02 gnd m30 gnd p06 gnd ah02 gnd p17 gnd v22 gnd b06 gnd t04 gnd d26 gnd ah14 gnd m12 gnd y14 gnd b02 gnd ab30 gnd am24 gnd b14 gnd h16 gnd v04 gnd ad20 gnd k10 gnd am28 gnd d16 gnd h04 gnd v08 gnd ak26 gnd ab12 gnd v16 gnd u20 gnd h26 gnd am32 gnd ak30 gnd p20 gnd h30 gnd ad28 gnd ab16 gnd k32 gnd ad10 gnd ab18 gnd k28 gnd y24 gnd p10 gnd h08 gnd ad32 gnd m04 gnd p02 gnd t18 gnd d18 gnd p32 gnd ad02 gnd ab08 gnd d30 gnd y32 gnd t22 gnd ak16 gnd ak08 gnd am06 gnd t26 gnd ak12 gnd k24 gnd u14 table 2-33. complete signal pin listing by signal name (page 6 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 network processor preliminary physical description page 92 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 gnd y28 lu_data(00) u15 mc_grant_b(0) r20 gnd f14 lu_data(01) u13 mc_grant_b(1) p21 gnd v18 lu_data(02) t09 mgrant_a(0) v19 gnd ab22 lu_data(03) t07 mgrant_a(1) u19 gnd y06 lu_data(04) r07 mgrant_b(0) ag27 gnd k14 lu_data(05) r08 mgrant_b(1) ae25 gnd f24 lu_data(06) w06 mg_clk j07 gnd h22 lu_data(07) w05 mg_data j06 gnd t08 lu_data(08) u08 mg_nintr j08 gnd m18 lu_data(09) v07 operational c32 gnd m16 lu_data(10) v09 pci_ad(00) ab29 gnd u17 lu_data(11) u12 pci_ad(01) ab27 gnd p28 lu_data(12) v15 pci_ad(02) ab25 i_freeq_th v21 lu_data(13) w04 pci_ad(03) ac33 jtag_tck aa22 lu_data(14) v11 pci_ad(04) ac32 jtag_tdi w22 lu_data(15) w07 pci_ad(05) ac31 jtag_tdo aa23 lu_data(16) w08 pci_ad(06) ac30 jtag_tms u22 lu_data(17) y07 pci_ad(07) ac29 jtag_trst t25 lu_data(18) v13 pci_ad(08) ac27 lu_addr(00) aa09 lu_data(19) w13 pci_ad(09) ac26 lu_addr(01) y11 lu_data(20) w01 pci_ad(10) ac25 lu_addr(02) aa10 lu_data(21) w09 pci_ad(11) ac24 lu_addr(03) ab07 lu_data(22) y09 pci_ad(12) ad33 lu_addr(04) ac09 lu_data(23) aa06 pci_ad(13) ad31 lu_addr(05) ae06 lu_data(24) t15 pci_ad(14) ad29 lu_addr(06) ae07 lu_data(25) w10 pci_ad(15) ad27 lu_addr(07) ac01 lu_data(26) w12 pci_ad(16) af29 lu_addr(08) r04 lu_data(27) w14 pci_ad(17) af27 lu_addr(09) ag05 lu_data(28) aa07 pci_ad(18) ag33 lu_addr(10) ag06 lu_data(29) aa08 pci_ad(19) ag32 lu_addr(11) ac04 lu_data(30) u01 pci_ad(20) ag31 lu_addr(12) ad07 lu_data(31) w11 pci_ad(21) ag30 lu_addr(13) af07 lu_data(32) y05 pci_ad(22) ag29 lu_addr(14) ab05 lu_data(33) r05 pci_ad(23) ag28 lu_addr(15) ae08 lu_data(34) u10 pci_ad(24) ah29 lu_addr(16) ab09 lu_data(35) u03 pci_ad(25) aj33 lu_addr(17) aa05 lu_r_wrt ac08 pci_ad(26) aj32 lu_addr(18) aa11 mc_grant_a(0) y21 pci_ad(27) aj31 lu_clk aj03 mc_grant_a(1) w21 pci_ad(28) aj30 table 2-33. complete signal pin listing by signal name (page 7 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 93 of 554 pci_ad(29) aj29 sch_addr(01) b05 send_grant_b t19 pci_ad(30) ak33 sch_addr(02) a04 spare_tst_rcvr(0) u05 pci_ad(31) ak31 sch_addr(03) e06 spare_tst_rcvr(1) e03 pci_bus_m_int ac23 sch_addr(04) e07 spare_tst_rcvr(2) a03 pci_bus_nm_int g27 sch_addr(05) e04 spare_tst_rcvr(3) t01 pci_cbe(0) ac28 sch_addr(06) h07 spare_tst_rcvr(4) al01 pci_cbe(1) ad25 sch_addr(07) f05 spare_tst_rcvr(5) g03 pci_cbe(2) af31 sch_addr(08) f07 spare_tst_rcvr(6) v01 pci_cbe(3) ah31 sch_addr(09) c03 spare_tst_rcvr(7) v03 pci_clk af33 sch_addr(10) d05 spare_tst_rcvr(8) t03 pci_devsel ae29 sch_addr(11) a02 spare_tst_rcvr(9) u33 pci_frame ae26 sch_addr(12) c04 switch_bna r21 pci_grant al32 sch_addr(13) b03 switch_clk_a an05 pci_idsel ah33 sch_addr(14) c02 switch_clk_a al05 pci_inta am33 sch_addr(15) d03 switch_clk_b c05 pci_irdy ae27 sch_addr(16) b01 switch_clk_b a05 pci_par ae33 sch_addr(17) c01 testmode(0) v05 pci_perr ae31 sch_addr(18) e05 testmode(1) u06 pci_request al33 sch_clk c07 thermal_in u04 pci_serr ae32 sch_data(00) a10 thermal_out u02 pci_speed m07 sch_data(01) c10 unused u07 pci_stop ae30 sch_data(02) f09 unused n05 pci_trdy ae28 sch_data(03) j11 unused t11 pgm_gnd j09 sch_data(04) l12 unused n10 pgm_vdd l11 sch_data(05) g09 unused t13 pio(0) n09 sch_data(06) b09 unused n12 pio(1) n08 sch_data(07) a09 unused r16 pio(2) n06 sch_data(08) a06 unused n07 plla_gnd ag01 sch_data(09) e08 unused m11 plla_vdd aj01 sch_data(10) g06 unused r12 pllb_gnd g01 sch_data(11) g07 unused r11 pllb_v dd e01 sch_data(12) c06 unused r09 pllc_gnd g33 sch_data(13) d07 unused u11 pllc_v dd e33 sch_data(14) c09 unused r13 res_data t21 sch_data(15) a08 unused u09 res_sync u21 sch_data(16) j10 unused t05 rx_lbyte(0) u23 sch_data(17) g08 unused k09 rx_lbyte(1) v23 sch_r_wrt g05 unused p07 sch_addr(00) a07 send_grant_a w20 unused l08 table 2-33. complete signal pin listing by signal name (page 8 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 network processor preliminary physical description page 94 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 unused l09 v dd b12 v dd v06 unused r01 v dd ab10 v dd ad16 unused p13 v dd ah26 v dd d28 unused m09 v dd ak06 v dd b18 unused l01 v dd ak02 v dd b26 unused n22 v dd m32 v dd af10 unused m05 v dd y18 v dd am16 unused aa24 v dd k30 v dd u18 unused r14 v dd v32 vref1(0) ad23 unused r10 v dd v24 vref1(1) k11 unused p05 v dd t10 vref1(2) ac10 unused r06 v dd am22 vref2(0) ac19 unused p11 v dd af20 vref2(1) ad17 unused p09 v dd d32 vref2(2) ac13 v dd ad04 v dd p16 vref2(3) l14 v dd t20 v dd y16 vref2(4) k17 v dd af06 v dd ad26 vref2(5) l21 v dd ab28 v dd b04 vref2(6) y13 v dd aa13 v dd am30 vref2(7) n11 v dd k12 v dd y30 vref2(8) l10 v dd t17 v dd ah30 2.5 v p12 v dd h02 v dd y08 2.5 v d02 v dd n21 v dd f08 2.5 v ak10 v dd k08 v dd p04 2.5 v f26 v dd t14 v dd ak24 2.5 v h10 v dd t02 v dd p18 2.5 v d06 v dd t28 v dd ah18 2.5 v am18 v dd k18 v dd m24 2.5 v d14 v dd h28 v dd af32 2.5 v am12 v dd f22 v dd am08 2.5 v af02 v dd v17 v dd ak14 2.5 v f18 v dd v20 v dd ah12 2.5 v k04 v dd h24 v dd m06 2.5 v ab20 v dd p26 v dd h14 2.5 v y04 v dd u16 v dd f16 2.5 v ah22 v dd d10 v dd n13 2.5 v ah04 v dd d20 v dd ab02 2.5 v v10 v dd aa21 v dd ad22 2.5 v y12 v dd f04 v dd v14 2.5 v m10 table 2-33. complete signal pin listing by signal name (page 9 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 95 of 554 note: all unused pins should be left unused on the card. 2.5 v ah08 2.5 v ah16 3.3 v p30 2.5 v ad18 2.5 v am26 3.3 v t24 2.5 v v02 2.5 v af24 3.3 v p22 2.5 v d24 2.5 v ab06 3.3 v af28 2.5 v t06 2.5 v am04 3.3 v ab24 2.5 v b16 2.5 v f12 3.3 v f30 2.5 v h06 2.5 v b08 3.3 v ab32 2.5 v af14 2.5 v ak20 3.3 v v28 2.5 v m14 2.5 v k16 3.3 v h32 2.5 v ab14 2.5 v ak28 3.3 v k26 2.5 v b30 2.5 v p08 3.3 v ak32 2.5 v h20 2.5 v m20 3.3 v t32 2.5 v k22 2.5 v ad12 3.3 v ad30 2.5 v ad08 3.3 v y22 2.5 v m02 3.3 v m28 2.5 v b22 3.3 v y26 table 2-33. complete signal pin listing by signal name (page 10 of 10) signal name grid position signal name grid position signal name grid position
ibm powernp np4gs3 network processor preliminary physical description page 96 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 table 2-34. complete signal pin listing by grid position (page 1 of 10) grid position signal name grid position signal name grid position signal name a01 dum aa04 dasl_in_a(6) ab07 lu_addr(03) a02 sch_addr(11) aa05 lu_addr(17) ab08 gnd a03 spare_tst_rcvr(2) aa06 lu_data(23) ab09 lu_addr(16) a04 sch_addr(02) aa07 lu_data(28) ab10 v dd a05 switch_clk_b aa08 lu_data(29) ab11 d0_data(03) a06 sch_data(08) aa09 lu_addr(00) ab12 gnd a07 sch_addr(00) aa10 lu_addr(02) ab13 d0_data(28) a08 sch_data(15) aa11 lu_addr(18) ab14 2.5v a09 sch_data(07) aa12 d0_data(00) ab15 d0_addr(05) a10 sch_data(00) aa13 v dd ab16 gnd a11 d4_addr(01) aa14 d0_addr(04) ab17 d1_addr(00) a12 dd_ba(0) aa15 d1_data(06) ab18 gnd a13 d4_data(31) aa16 d3_data(01) ab19 d2_dqs(0) a14 d4_data(24) aa17 d3_addr(06) ab20 2.5v a15 d4_data(19) aa18 d2_data(03) ab21 d6_data(04) a16 d4_data(11) aa19 d2_addr(12) ab22 gnd a17 d4_data(04) aa20 da_ba(1) ab23 c405_debug_halt a18 ds1_data(26) aa21 v dd ab24 3.3v a19 ds1_data(19) aa22 jtag_tck ab25 pci_ad(02) a20 ds1_data(14) aa23 jtag_tdo ab26 gnd a21 ds1_data(06) aa24 unused ab27 pci_ad(01) a22 ds1_data(00) aa25 dmu_a(28) ab28 v dd a23 ds0_addr(10) aa26 dmu_a(27) ab29 pci_ad(00) a24 ds0_dqs(2) aa27 dmu_a(26) ab30 gnd a25 ds0_data(28) aa28 dmu_a(25) ab31 dmu_a(30) a26 ds0_data(20) aa29 dmu_a(24) ab32 3.3v a27 ds0_data(14) aa30 dmu_a(23) ab33 dmu_a(29) a28 ds0_addr(00) aa31 dmu_a(22) ac01 lu_addr(07) a29 ds0_data(09) aa32 dmu_a(21) ac02 dasl_in_a(5) a30 ds0_data(13) aa33 dmu_a(20) ac03 dasl_in_a(5) a31 ds0_cs ab01 dasl_out_a(2) ac04 lu_addr(11) a32 ds0_data(01) ab02 v dd ac05 dasl_out_a(3) a33 ds0_data(03) ab03 dasl_in_a(6) ac06 dasl_out_a(3) aa01 dasl_out_a(1) ab04 gnd ac07 dasl_out_a(4) aa02 dasl_out_a(2) ab05 lu_addr(14) ac08 lu_r_wrt aa03 dasl_in_a(7) ab06 2.5v ac09 lu_addr(04)
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 97 of 554 ac10 vref1(2) ad13 db_ba(1) ae16 d3_data(09) ac11 d0_data(13) ad14 gnd ae17 d2_data(08) ac12 d0_dqs(1) ad15 d1_cs ae18 d3_addr(05) ac13 vref2(2) ad16 v dd ae19 d3_addr(12) ac14 d0_addr(10) ad17 vref2(1) ae20 d2_data(07) ac15 d0_data(29) ad18 2.5v ae21 d2_data(09) ac16 d1_data(15) ad19 d3_we ae22 d6_data(14) ac17 d3_dqs(1) ad20 gnd ae23 d6_data(09) ac18 d3_addr(07) ad21 d2_addr(06) ae24 d6_addr(02) ac19 vref2(0) ad22 vdd ae25 mgrant_b(1) ac20 d2_addr(05) ad23 vref1(0) ae26 pci_frame ac21 d6_data(10) ad24 gnd ae27 pci_irdy ac22 d6_data(15) ad25 pci_cbe(1) ae28 pci_trdy ac23 pci_bus_m_int ad26 vdd ae29 pci_devsel ac24 pci_ad(11) ad27 pci_ad(15) ae30 pci_stop ac25 pci_ad(10) ad28 gnd ae31 pci_perr ac26 pci_ad(09) ad29 pci_ad(14) ae32 pci_serr ac27 pci_ad(08) ad30 3.3v ae33 pci_par ac28 pci_cbe(0) ad31 pci_ad(13) af01 dasl_in_a(1) ac29 pci_ad(07) ad32 gnd af02 2.5v ac30 pci_ad(06) ad33 pci_ad(12) af03 dasl_out_a(6) ac31 pci_ad(05) ae01 dasl_in_a(3) af04 gnd ac32 pci_ad(04) ae02 dasl_in_a(1) af05 dasl_out_a(6) ac33 pci_ad(03) ae03 dasl_in_a(3) af06 v dd ad01 dasl_in_a(4) ae04 dasl_out_a(5) af07 lu_addr(13) ad02 gnd ae05 dasl_out_a(5) af08 gnd ad03 dasl_in_a(4) ae06 lu_addr(05) af09 d0_data(20) ad04 v dd ae07 lu_addr(06) af10 v dd ad05 dasl_out_a(4) ae08 lu_addr(15) af11 d0_addr(12) ad06 gnd ae09 d0_data(11) af12 gnd ad07 lu_addr(12) ae10 d0_data(22) af13 d1_addr(06) ad08 2.5v ae11 d0_dqs(2) af14 2.5v ad09 d0_data(01) ae12 d1_data(00) af15 d3_data(14) ad10 gnd ae13 db_ras af16 gnd ad11 d0_data(23) ae14 d1_addr(07) af17 d3_addr(02) ad12 2.5v ae15 d3_data(03) af18 gnd table 2-34. complete signal pin listing by grid position (page 2 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 network processor preliminary physical description page 98 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 af19 d3_data(15) ag22 d2_data(10) ah25 d6_data(08) af20 v dd ag23 d2_addr(07) ah26 v dd af21 d6_dqs_par(01) ag24 d6_byteen(0) ah27 d6_addr(12) af22 gnd ag25 da_ras ah28 gnd af23 d6_byteen(1) ag26 d6_addr(03) ah29 pci_ad(24) af24 2.5v ag27 mgrant_b(0) ah30 vdd af25 d6_addr(04) ag28 pci_ad(23) ah31 pci_cbe(3) af26 gnd ag29 pci_ad(22) ah32 gnd af27 pci_ad(17) ag30 pci_ad(21) ah33 pci_idsel af28 3.3v ag31 pci_ad(20) aj01 plla_vdd af29 pci_ad(16) ag32 pci_ad(19) aj02 dasl_in_a(0) af30 gnd ag33 pci_ad(18) aj03 lu_clk af31 pci_cbe(2) ah01 dasl_in_a(2) aj04 de_clk af32 vdd ah02 gnd aj05 de_clk af33 pci_clk ah03 dasl_in_a(2) aj06 d0_data(14) ag01 plla_gnd ah04 2.5v aj07 d0_data(16) ag02 dasl_out_a(7) ah05 de_ba(1) aj08 d0_data(31) ag03 d0_data(09) ah06 gnd aj09 d0_addr(02) ag04 dasl_out_a(7) ah07 d0_data(10) aj10 d1_data(03) ag05 lu_addr(09) ah08 2.5v aj11 d1_data(07) ag06 lu_addr(10) ah09 d0_dqs(3) aj12 db_cas ag07 d0_data(02) ah10 gnd aj13 d1_addr(01) ag08 d0_data(21) ah11 d1_data(11) aj14 d1_dqs(0) ag09 d0_dqs(0) ah12 v dd aj15 d1_addr(12) ag10 d0_addr(11) ah13 d1_data(14) aj16 d3_addr(01) ag11 db_ba(0) ah14 gnd aj17 d3_addr(03) ag12 d1_addr(05) ah15 d1_addr(11) aj18 db_clk ag13 d3_data(00) ah16 2.5v aj19 d2_data(01) ag14 d3_data(02) ah17 d3_data(11) aj20 d2_data(04) ag15 d3_data(13) ah18 v dd aj21 d2_data(14) ag16 d3_data(10) ah19 db_clk aj22 d2_addr(00) ag17 d3_data(12) ah20 gnd aj23 d2_addr(11) ag18 d3_addr(04) ah21 d2_addr(01) aj24 d2_we ag19 d3_addr(00) ah22 2.5v aj25 d6_we ag20 d3_dqs(0) ah23 d2_addr(04) aj26 d6_data(12) ag21 d6_addr(11) ah24 gnd aj27 d6_addr(07) table 2-34. complete signal pin listing by grid position (page 3 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 99 of 554 aj28 d6_addr(09) ak31 pci_ad(31) am01 de_ba(0) aj29 pci_ad(29) ak32 3.3v am02 gnd aj30 pci_ad(28) ak33 pci_ad(30) am03 d0_data(05) aj31 pci_ad(27) al01 spare_tst_rcvr(4) am04 2.5v aj32 pci_ad(26) al02 de_cas am05 d0_data(27) aj33 pci_ad(25) al03 d0_data(12) am06 gnd ak01 dasl_in_a(0) al04 d0_data(08) am07 d0_addr(06) ak02 v dd al05 switch_clk_a am08 v dd ak03 de_ras al06 d0_data(26) am09 d0_we ak04 gnd al07 d0_data(19) am10 gnd ak05 d0_data(15) al08 d0_addr(07) am11 d0_addr(08) ak06 v dd al09 d0_data(25) am12 2.5v ak07 d0_data(30) al10 d0_addr(00) am13 d1_data(12) ak08 gnd al11 d0_addr(09) am14 gnd ak09 d1_data(04) al12 d1_data(02) am15 d1_addr(03) ak10 2.5v al13 d1_data(09) am16 v dd ak11 d1_data(08) al14 d1_addr(09) am17 d3_data(05) ak12 gnd al15 d3_data(06) am18 2.5v ak13 d1_addr(02) al16 d1_we am19 d2_data(12) ak14 v dd al17 d3_data(04) am20 gnd ak15 d3_data(07) al18 d3_cs am21 d2_addr(03) ak16 gnd al19 d3_addr(09) am22 v dd ak17 d3_addr(11) al20 d2_data(05) am23 d6_data(01) ak18 gnd al21 d2_addr(09) am24 gnd ak19 d3_addr(08) al22 d2_cs am25 da_clk ak20 2.5v al23 d6_data(00) am26 2.5v ak21 d2_data(13) al24 d6_data(07) am27 d6_data(03) ak22 gnd al25 da_clk am28 gnd ak23 d2_addr(10) al26 d6_data(02) am29 d6_parity(00) ak24 v dd al27 d6_addr(05) am30 v dd ak25 d2_dqs(1) al28 d6_addr(00) am31 d6_dqs(3) ak26 gnd al29 d6_addr(10) am32 gnd ak27 d6_data(13) al30 d6_dqs(0) am33 pci_inta ak28 2.5v al31 d6_cs an01 d0_data(06) ak29 d6_addr(08) al32 pci_grant an02 d0_data(04) ak30 gnd al33 pci_request an03 d0_data(07) table 2-34. complete signal pin listing by grid position (page 4 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 network processor preliminary physical description page 100 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 an04 d0_data(17) b07 d4_addr(12) c10 sch_data(01) an05 switch_clk_a b08 2.5v c11 d4_addr(06) an06 d0_addr(03) b09 sch_data(06) c12 d4_addr(00) an07 d0_data(18) b10 gnd c13 dd_clk an08 d0_data(24) b11 d4_addr(07) c14 d4_data(17) an09 d0_cs b12 v dd c15 d4_data(03) an10 d0_addr(01) b13 dd_clk c16 d4_data(10) an11 d1_data(01) b14 gnd c17 ds1_we an12 d1_data(10) b15 d4_data(25) c18 ds1_data(27) an13 d1_data(13) b16 2.5v c19 ds1_dqs(1) an14 d1_addr(04) b17 ds1_dqs(0) c20 ds1_data(21) an15 d1_addr(08) b18 v dd c21 dc_ras an16 d1_dqs(1) b19 ds1_data(13) c22 ds0_addr(11) an17 d3_addr(10) b20 gnd c23 ds0_addr(06) an18 d2_data(00) b21 ds1_data(05) c24 ds0_dqs(1) an19 d2_data(06) b22 2.5v c25 ds0_data(21) an20 d2_data(11) b23 ds0_addr(05) c26 ds0_addr(04) an21 d2_addr(02) b24 gnd c27 ds0_data(15) an22 d2_addr(08) b25 ds0_data(29) c28 ds0_data(22) an23 d6_parity(01) b26 v dd c29 ds0_data(08) an24 d6_data(06) b27 ds0_addr(03) c30 ds0_data(04) an25 d6_data(11) b28 gnd c31 ds0_data(05) an26 d6_addr(01) b29 ds0_data(23) c32 operational an27 da_cas b30 2.5v c33 core_clock an28 d6_data(05) b31 ds0_data(02) d01 dasl_in_b(0) an29 da_ba(0) b32 gnd d02 2.5v an30 d6_addr(06) b33 clock125 d03 sch_addr(15) an31 d6_dqs(1) c01 sch_addr(17) d04 gnd an32 d6_dqs_par(00) c02 sch_addr(14) d05 sch_addr(10) an33 d6_dqs(2) c03 sch_addr(09) d06 2.5v b01 sch_addr(16) c04 sch_addr(12) d07 sch_data(13) b02 gnd c05 switch_clk_b d08 gnd b03 sch_addr(13) c06 sch_data(12) d09 d4_addr(08) b04 v dd c07 sch_clk d10 v dd b05 sch_addr(01) c08 d4_addr(09) d11 d4_dqs(0) b06 gnd c09 sch_data(14) d12 gnd table 2-34. complete signal pin listing by grid position (page 5 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 101 of 554 d13 d4_data(26) e16 ds1_cs f19 ds1_data(24) d14 2.5v e17 ds1_addr(03) f20 gnd d15 d4_data(02) e18 ds1_data(20) f21 ds1_data(07) d16 gnd e19 ds1_data(25) f22 v dd d17 d4_data(05) e20 ds1_data(22) f23 ds1_data(04) d18 gnd e21 ds1_data(11) f24 gnd d19 ds1_dqs(2) e22 ds1_data(08) f25 ds0_dqs(0) d20 v dd e23 dc_clk f26 2.5v d21 ds1_data(12) e24 ds0_addr(12) f27 ds0_data(06) d22 gnd e25 ds0_dqs(3) f28 gnd d23 dc_clk e26 ds0_data(27) f29 dmu_d(07) d24 2.5v e27 ds0_data(12) f30 3.3v d25 dc_ba(0) e28 ds0_data(10) f31 dmu_d(06) d26 gnd e29 blade_reset f32 gnd d27 ds0_data(26) e30 dmu_d(04) f33 dmu_d(05) d28 v dd e31 dmu_d(29) g01 pllb_gnd d29 ds0_data(11) e32 dmu_d(12) g02 dasl_in_b(3) d30 gnd e33 pllc_v dd g03 spare_tst_rcvr(5) d31 dmu_d(01) f01 dasl_in_b(2) g04 dasl_in_b(3) d32 v dd f02 gnd g05 sch_r_wrt d33 dmu_d(00) f03 dasl_in_b(2) g06 sch_data(10) e01 pllb_v dd f04 v dd g07 sch_data(11) e02 dasl_in_b(0) f05 sch_addr(07) g08 sch_data(17) e03 spare_tst_rcvr(1) f06 gnd g09 sch_data(05) e04 sch_addr(05) f07 sch_addr(08) g10 d4_addr(04) e05 sch_addr(18) f08 v dd g11 dd_cas e06 sch_addr(03) f09 sch_data(02) g12 d4_data(22) e07 sch_addr(04) f10 gnd g13 d4_data(08) e08 sch_data(09) f11 d4_we g14 d4_data(07) e09 d4_data(18) f12 2.5v g15 ds1_addr(08) e10 d4_cs f13 d4_data(30) g16 ds1_addr(11) e11 d4_dqs(1) f14 gnd g17 ds1_addr(09) e12 d4_data(29) f15 d4_data(13) g18 ds1_addr(02) e13 d4_data(27) f16 v dd g19 ds1_addr(05) e14 d4_data(16) f17 ds1_addr(10) g20 ds1_data(30) e15 d4_data(12) f18 2.5v g21 ds1_data(29) table 2-34. complete signal pin listing by grid position (page 6 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 network processor preliminary physical description page 102 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 g22 ds1_data(15) h25 ds0_data(16) j28 dmu_d(22) g23 ds1_data(01) h26 gnd j29 dmu_d(03) g24 ds0_addr(07) h27 dmu_d(16) j30 dmu_d(20) g25 ds0_data(30) h28 v dd j31 dmu_d(19) g26 ds0_data(17) h29 dmu_d(15) j32 dmu_d(18) g27 pci_bus_nm_int h30 gnd j33 dmu_d(17) g28 dmu_d(02) h31 dmu_d(14) k01 dasl_in_b(5) g29 dmu_d(11) h32 3.3v k02 gnd g30 dmu_d(10) h33 dmu_d(13) k03 dasl_in_b(5) g31 dmu_d(30) j01 dasl_in_b(4) k04 2.5v g32 dmu_d(08) j02 dasl_in_b(1) k05 dasl_out_b(5) g33 pllc_gnd j03 dasl_in_b(4) k06 gnd h01 dasl_in_b(1) j04 dasl_out_b(6) k07 boot_picocode h02 v dd j05 dasl_out_b(6) k08 v dd h03 dasl_out_b(7) j06 mg_data k09 unused h04 gnd j07 mg_clk k10 gnd h05 dasl_out_b(7) j08 mg_nintr k11 vref1(1) h06 2.5v j09 pgm_gnd k12 v dd h07 sch_addr(06) j10 sch_data(16) k13 dd_ras h08 gnd j11 sch_data(03) k14 gnd h09 d4_data(06) j12 d4_addr(02) k15 d4_data(09) h10 2.5v j13 dd_ba(1) k16 2.5v h11 d4_addr(03) j14 d4_data(21) k17 vref2(4) h12 gnd j15 ds1_data(17) k18 v dd h13 d4_data(23) j16 ds1_addr(12) k19 ds1_data(28) h14 v dd j17 dc_ba(1) k20 gnd h15 ds1_addr(07) j18 ds1_addr(01) k21 ds1_data(02) h16 gnd j19 ds1_data(31) k22 2.5v h17 ds1_addr(04) j20 ds1_data(18) k23 ds0_data(19) h18 gnd j21 ds1_data(16) k24 gnd h19 ds1_addr(06) j22 ds0_addr(09) k25 dmu_d(09) h20 2.5v j23 ds0_we k26 3.3v h21 ds0_data(07) j24 ds0_data(18) k27 dmu_d(21) h22 gnd j25 dmu_d(25) k28 gnd h23 ds0_addr(08) j26 dmu_d(24) k29 dmu_d(28) h24 v dd j27 dmu_d(23) k30 v dd table 2-34. complete signal pin listing by grid position (page 7 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 103 of 554 k31 dmu_d(27) m02 2.5v n06 pio(2) k32 gnd m03 dasl_in_b(7) n07 unused k33 dmu_d(26) m04 gnd n08 pio(1) l01 unused m05 unused n09 pio(0) l02 dasl_in_b(6) m06 v dd n10 unused l03 dasl_in_b(6) m07 pci_speed n11 vref2(7) l04 boot_ppc m08 gnd n12 unused l05 dasl_out_b(4) m09 unused n13 vdd l06 dasl_out_b(4) m10 2.5v n14 d4_addr(11) l07 dasl_out_b(5) m11 unused n15 d4_dqs(2) l08 unused m12 gnd n16 d4_data(15) l09 unused m13 d4_addr(10) n17 ds1_addr(00) l10 vref2(8) m14 2.5v n18 ds1_data(23) l11 pgm_vdd m15 d4_dqs(3) n19 dc_cas l12 sch_data(04) m16 gnd n20 ds0_addr(01) l13 d4_addr(05) m17 d4_data(00) n21 v dd l14 vref2(3) m18 gnd n22 unused l15 d4_data(20) m19 ds0_addr(02) n23 dmu_c(27) l16 d4_data(01) m20 2.5v n24 dmu_c(26) l17 ds1_data(10) m21 ds0_data(24) n25 dmu_c(25) l18 ds1_data(09) m22 gnd n26 dmu_c(24) l19 ds0_data(25) m23 dmu_c(16) n27 dmu_c(23) l20 ds1_data(03) m24 v dd n28 dmu_c(22) l21 vref2(5) m25 dmu_c(15) n29 dmu_c(21) l22 ds0_data(31) m26 gnd n30 dmu_c(20) l23 dmu_c(10) m27 dmu_c(14) n31 dmu_c(19) l24 dmu_c(09) m28 3.3v n32 dmu_c(18) l25 dmu_c(08) m29 dmu_c(13) n33 dmu_c(17) l26 dmu_c(07) m30 gnd p01 dasl_out_b(0) l27 dmu_c(06) m31 dmu_c(12) p02 gnd l28 dmu_c(05) m32 v dd p03 dasl_out_b(3) l29 dmu_c(04) m33 dmu_c(11) p04 v dd l30 dmu_c(03) n01 dasl_out_b(1) p05 unused l31 dmu_c(02) n02 dasl_out_b(2) p06 gnd l32 dmu_c(01) n03 dasl_out_b(3) p07 unused l33 dmu_c(00) n04 dasl_in_b(7) p08 2.5v m01 dasl_out_b(2) n05 unused p09 unused table 2-34. complete signal pin listing by grid position (page 8 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 network processor preliminary physical description page 104 of 554 np3_dl_sec02_phys.fm.08 may 18, 2001 p10 gnd r14 unused t18 gnd p11 unused r15 gnd t19 send_grant_b p12 2.5v r16 unused t20 v dd p13 unused r17 d4_data(14) t21 res_data p14 gnd r18 ds1_dqs(3) t22 gnd p15 d4_data(28) r19 gnd t23 dmu_b(19) p16 v dd r20 mc_grant_b(0) t24 3.3v p17 gnd r21 switch_bna t25 jtag_trst p18 v dd r22 dmu_b(18) t26 gnd p19 ds0_data(00) r23 dmu_b(13) t27 dmu_b(17) p20 gnd r24 dmu_b(12) t28 v dd p21 mc_grant_b(1) r25 dmu_b(11) t29 dmu_b(16) p22 3.3v r26 dmu_b(10) t30 gnd p23 dmu_b(02) r27 dmu_b(09) t31 dmu_b(15) p24 gnd r28 dmu_b(08) t32 3.3v p25 dmu_b(01) r29 dmu_b(07) t33 dmu_b(14) p26 v dd r30 dmu_b(06) u01 lu_data(30) p27 dmu_b(00) r31 dmu_b(05) u02 thermal_out p28 gnd r32 dmu_b(04) u03 lu_data(35) p29 dmu_c(30) r33 dmu_b(03) u04 thermal_in p30 3.3v t01 spare_tst_rcvr(3) u05 spare_tst_rcvr(0) p31 dmu_c(29) t02 v dd u06 testmode(1) p32 gnd t03 spare_tst_rcvr(8) u07 unused p33 dmu_c(28) t04 gnd u08 lu_data(08) r01 unused t05 unused u09 unused r02 dasl_out_b(0) t06 2.5v u10 lu_data(34) r03 dasl_out_b(1) t07 lu_data(03) u11 unused r04 lu_addr(08) t08 gnd u12 lu_data(11) r05 lu_data(33) t09 lu_data(02) u13 lu_data(01) r06 unused t10 v dd u14 gnd r07 lu_data(04) t11 unused u15 lu_data(00) r08 lu_data(05) t12 gnd u16 v dd r09 unused t13 unused u17 gnd r10 unused t14 v dd u18 v dd r11 unused t15 lu_data(24) u19 mgrant_a(1) r12 unused t16 gnd u20 gnd r13 unused t17 v dd u21 res_sync table 2-34. complete signal pin listing by grid position (page 9 of 10) grid position signal name grid position signal name grid position signal name
ibm powernp np4gs3 preliminary network processor np3_dl_sec02_phys.fm.08 may 18, 2001 physical description page 105 of 554 u22 jtag_tms v26 gnd w30 dmu_a(06) u23 rx_lbyte(0) v27 dmu_a(02) w31 dmu_a(05) u24 dmu_b(29) v28 3.3v w32 dmu_a(04) u25 dmu_b(28) v29 dmu_a(01) w33 dmu_a(03) u26 dmu_b(27) v30 gnd y01 dasl_out_a(0) u27 dmu_b(26) v31 dmu_a(00) y02 gnd u28 dmu_b(25) v32 v dd y03 dasl_in_a(7) u29 dmu_b(24) v33 dmu_b(30) y04 2.5v u30 dmu_b(23) w01 lu_data(20) y05 lu_data(32) u31 dmu_b(22) w02 dasl_out_a(0) y06 gnd u32 dmu_b(21) w03 dasl_out_a(1) y07 lu_data(17) u33 spare_tst_rcvr(9) w04 lu_data(13) y08 v dd v01 spare_tst_rcvr(6) w05 lu_data(07) y09 lu_data(22) v02 2.5v w06 lu_data(06) y10 gnd v03 spare_tst_rcvr(7) w07 lu_data(15) y11 lu_addr(01) v04 gnd w08 lu_data(16) y12 2.5v v05 testmode(0) w09 lu_data(21) y13 vref2(6) v06 v dd w10 lu_data(25) y14 gnd v07 lu_data(09) w11 lu_data(31) y15 d1_data(05) v08 gnd w12 lu_data(26) y16 v dd v09 lu_data(10) w13 lu_data(19) y17 gnd v10 2.5v w14 lu_data(27) y18 v dd v11 lu_data(14) w15 gnd y19 d2_data(15) v12 gnd w16 d1_addr(10) y20 gnd v13 lu_data(18) w17 d3_data(08) y21 mc_grant_a(0) v14 v dd w18 d2_data(02) y22 3.3v v15 lu_data(12) w19 gnd y23 dmu_a(19) v16 gnd w20 send_grant_a y24 gnd v17 v dd w21 mc_grant_a(1) y25 dmu_a(18) v18 gnd w22 jtag_tdi y26 3.3v v19 mgrant_a(0) w23 dmu_a(13) y27 dmu_a(17) v20 v dd w24 dmu_a(12) y28 gnd v21 i_freeq_th w25 dmu_a(11) y29 dmu_a(16) v22 gnd w26 dmu_a(10) y30 v dd v23 rx_lbyte(1) w27 dmu_a(09) y31 dmu_a(15) v24 v dd w28 dmu_a(08) y32 gnd v25 dmu_b(20) w29 dmu_a(07) y33 dmu_a(14) table 2-34. complete signal pin listing by grid position (page 10 of 10) grid position signal name grid position signal name grid position signal name
2.5 ieee 1149 (jtag) compliance

2.5.1 statement of jtag compliance

the np4gs3 is compliant with ieee standard 1149.1a.

2.5.2 jtag compliance mode

compliance with ieee 1149.1a is enabled by applying a compliance-enable pattern to the compliance-enable inputs as shown in table 2-35.

table 2-35. jtag compliance-enable inputs

compliance-enable inputs    compliance-enable pattern
testmode(1:0)               '10'
spare_tst_rcvr(9)           1
spare_tst_rcvr(4)           1
spare_tst_rcvr(3)           1
spare_tst_rcvr(2)           1

note: to achieve reset of the jtag test logic, the jtag_trst input must be driven low when the compliance-enable pattern is applied.

2.5.3 jtag implementation specifics

all mandatory jtag public instructions are implemented in the np4gs3's design. table 2-36 documents all implemented public instructions.

table 2-36. implemented jtag public instructions

instruction name    binary code (1)    serial data reg connected to tdi/tdo    i/o control source (driver data and driver enable)
bypass              111 1111           bypass                                  system, functional values
clamp               111 1101           bypass                                  jtag
extest              111 1000           boundary scan                           jtag
highz               111 1010           bypass                                  jtag, all drivers disabled
sample/preload      111 1001           boundary scan                           system, functional values

(1) the device tdo output driver is only enabled during tap controller states shift_ir and shift_dr.

2.5.4 brief overview of jtag instructions

bypass: connects the bypass data register between the tdi and tdo pins. the bypass data register is a single shift-register stage that provides a minimum-length serial path between the tdi and tdo pins when no test operation of the device is required. bypass does not disturb the normal functional connection and control of the i/o pins.
clamp: causes all output pins to be driven from the corresponding jtag parallel boundary scan register. the sample/preload instruction is typically used to load the desired values into the parallel boundary scan register. since the clamp instruction causes the serial tdi/tdo path to be connected to the bypass data register, scanning through the device is very fast when the clamp instruction is loaded.

extest: allows the jtag logic to control output pin values by connecting each output data and enable signal to its corresponding parallel boundary scan register. the desired controlling values for the output data and enable signals are shifted into the scan boundary scan register during the shiftdr state. the parallel boundary scan register is loaded from the scan boundary scan register during the updatedr state. the extest instruction also allows the jtag logic to sample input receiver and output enable values. the values are loaded into the scan boundary scan register during the capturedr state.

highz: causes the jtag logic to tri-state all output drivers while connecting the bypass register in the serial tdi/tdo path.

sample/preload: allows the jtag logic to sample input pin values and load the parallel boundary scan register without disturbing the normal functional connection and control of the i/o pins. the sample phase of the instruction occurs in the capturedr state, at which time the scan boundary scan register is loaded with the corresponding input receiver and output enable values. (note that for input pins that are connected to a common i/o, the scan boundary scan register only updates with an input receiver sample if the corresponding output driver of the common i/o is disabled; otherwise the scan register is updated with the output data signal.) the desired controlling values for the output pins are shifted into the scan boundary scan register during the shiftdr state and loaded from the scan boundary scan register to the parallel boundary scan register during the updatedr state.
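for reference, the instruction opcodes from table 2-36 can be collected in a small c header for boundary-scan test software. this is an illustrative sketch only; the identifier names are not part of the np4gs3 documentation, and only the 7-bit binary codes come from the table.

```c
/* illustrative only: jtag public instruction opcodes from table 2-36.
 * names are assumptions; the binary codes are the documented values. */
enum np4gs3_jtag_instruction {
    JTAG_BYPASS         = 0x7F,  /* 111 1111 - bypass register in tdi/tdo path   */
    JTAG_CLAMP          = 0x7D,  /* 111 1101 - outputs driven from boundary scan */
    JTAG_EXTEST         = 0x78,  /* 111 1000 - jtag controls output pins         */
    JTAG_HIGHZ          = 0x7A,  /* 111 1010 - all output drivers disabled       */
    JTAG_SAMPLE_PRELOAD = 0x79   /* 111 1001 - sample inputs / preload scan reg  */
};
```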
3. physical mac multiplexer

the physical mac multiplexer (pmm) moves data between physical layer devices and the network processor. the pmm interfaces with the network processor's external ports in the ingress (i-pmm) and egress (e-pmm) directions.

the pmm includes five data mover units (dmus), as illustrated in figure 3-1. four dmus (a, b, c, and d) can each be independently configured as an ethernet medium access control (mac) or a packet over sonet (pos) mac. the device keeps a complete set of performance statistics on a per-port basis in either mode. each dmu moves data at 1 gigabit per second (gbps) in both the ingress and the egress directions. the wrap dmu enables traffic generated by the np4gs3's egress side to move to the ingress side of the device and up to the switch fabric.

figure 3-1. pmm overview (diagram: dmus a, b, c, d and the wrap dmu connecting the ingress eds and the egress eds within the np4gs3)
3.1 ethernet overview

a dmu configured in ethernet mode can support either one port of gigabit ethernet or ten ports of fast (10/100) ethernet. when in gigabit ethernet mode, each dmu can be configured with either a gigabit media-independent interface (gmii) or a ten-bit interface (tbi). when in fast ethernet mode, each dmu can be configured with a serial media-independent interface (smii). in this mode, the single dmu mac can be configured as ten ports. each of these four dmus can be configured in any of the ethernet operation modes.

figure 3-2: ethernet mode on page 110 shows a sample np4gs3 with the pmm configured for ethernet interfaces. dmus a and b are configured as gigabit ethernet macs. dmus c and d are configured as fast ethernet macs.

figure 3-2. ethernet mode (diagram: dmus a and b each attach to a phy over gmii or tbi for one gigabit ethernet port; dmus c and d each attach to phys over smii for up to ten fast ethernet ports)
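the per-dmu mode choice described above can be summarized in a small configuration sketch. this is illustrative only: the enum, structure, and example values are assumptions and not part of the np4gs3 programming interface, which is configured through its cab-accessible registers.

```c
/* illustrative sketch of the per-dmu ethernet mode choice described above.
 * type names are assumptions; the device itself is configured via the cab. */
enum dmu_eth_mode {
    DMU_ETH_GMII,   /* one gigabit ethernet port, gmii to the phy      */
    DMU_ETH_TBI,    /* one gigabit ethernet port, ten-bit interface    */
    DMU_ETH_SMII    /* up to ten 10/100 fast ethernet ports over smii  */
};

struct pmm_eth_config {
    enum dmu_eth_mode dmu[4];   /* dmus a, b, c, d - independently configured */
};

/* example roughly matching figure 3-2: a and b gigabit (gmii chosen
 * arbitrarily; tbi is equally valid), c and d fast ethernet over smii. */
static const struct pmm_eth_config figure_3_2_example = {
    .dmu = { DMU_ETH_GMII, DMU_ETH_GMII, DMU_ETH_SMII, DMU_ETH_SMII }
};
```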
3.1.1 ethernet interface timing diagrams

the following figures show timing diagrams for the ethernet interfaces:

- figure 3-3: smii timing diagram on page 111
- figure 3-4: gmii timing diagram on page 111
- figure 3-5: tbi timing diagram on page 112
- figure 3-6: gmii pos mode timing diagram on page 113

figure 3-3. smii timing diagram (waveforms for clock and sync with the serialized receive stream crs, rx_dv, rxd0-rxd7 and transmit stream tx_er, tx_en, txd0-txd7)

figure 3-4. gmii timing diagram (receive: rx_clk, rx_dv, rxd<7:0>, and rx_er showing preamble, sfd, data, and fcs; transmit: tx_clk, tx_en, txd<7:0>, and tx_er showing preamble, data, and fcs)
figure 3-5. tbi timing diagram (transmit data: tx_clk, tx_en, txd[7:0], tx_er, and the tx code group, where i = idle, s = start of packet, d = data, t = end of packet (part 1), and r = end of packet (part 2 or 3); receive clocks and receive data: rx_clk1, rx_clk0, and rx_code_group[9:0], showing comma code-group alignment and the t_setup and t_hold requirements against the 2.0 v / 1.4 v / 0.8 v input thresholds)
3.1.2 ethernet counters

the np4gs3 provides 36 ethernet statistics counters per mac (up to one million software-defined, hardware-assisted counters), enabling support of many standard management information bases (mibs) at wire speed. table 3-1: ingress ethernet counters on page 114 and table 3-2: egress ethernet counters on page 116 show the statistics counters kept in each dmu when it operates in ethernet mode. these counters are accessible through the control access bus (cab) interface.

figure 3-6. gmii pos mode timing diagram (receive: rx_data carrying valid bytes and skip bytes; transmit: tx_data with a byte credit indicating ok-to-send or do-not-send for each transmitted valid byte)
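because the counters are read over the cab, management code can poll them per dmu and per port. the sketch below is an assumption-laden illustration of such a read path: the cab_read() helper and the address composition are invented for the example; only the counter numbers (for instance x'0e' for total frames received) come from table 3-1 on the following pages.

```c
#include <stdint.h>

/* counter numbers from table 3-1 (ingress ethernet counters) */
#define ETH_RX_TOTAL_FRAMES   0x0E   /* total frames received  */
#define ETH_RX_TOTAL_OCTETS   0x12   /* total octets received  */
#define ETH_RX_FRAMES_BAD_CRC 0x0A   /* frames with bad crc    */

/* hypothetical cab access routine; the real cab address map is device
 * specific and not reproduced here. */
extern uint64_t cab_read(uint32_t cab_address);

/* assumed address composition: dmu, port, and counter number select one
 * ingress ethernet counter. purely illustrative. */
static uint64_t read_ingress_eth_counter(unsigned dmu, unsigned port,
                                         unsigned counter_no)
{
    uint32_t addr = (dmu << 12) | (port << 8) | counter_no;  /* assumption */
    return cab_read(addr);
}
```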
ibm powernp np4gs3 network processor preliminary physical mac multiplexer page 114 of 554 np3_dl_sec03_pmm.fm.08 may 18, 2001 table 3-1. ingress ethernet counters (page 1 of 2) name / group counter no. group description notes short frames x ? 00 ? 4 total number of frames received that were less than 64 octets long and were otherwise well-formed (had a good crc) 1, 2 fragments x ? 01 ? total number of frames received that were less than 64 octets long and had a bad crc 1, 2 frames received (64 octets) x ? 02 ? total number of frames (including bad frames) received that were 64 octets in length 1, 2 frames received (65 to 127 octets) x ? 03 ? total number of frames (including bad frames) received that were between 65 and 127 octets in length inclusive 1, 2 frames received (128 to 255 octets) x ? 04 ? total number of frames (including bad frames) received that were between 128 and 255 octets in length inclusive 1, 2 frames received (256 to 511 octets) x ? 05 ? total number of frames (including bad frames) received that were between 256 and 511 octets in length inclusive 1, 2 frames received (512 to 1023 octets) x ? 06 ? total number of frames (including bad frames) received that were between 512 and 1023 octets in length inclusive 1, 2 frames received (1024 to 1518 octets) x ? 07 ? total number of frames (including bad frames) received that were between 1024 and 1518 octets in length inclusive 1, 2 jumbo frames x ? 14 ? total number of frames (including bad frames) received with a length between 1519 and 9018 octets, or between 1519 and 9022 octets if vlan is asserted. if jumbo is not asserted the frame is a long frame. long frames received x ? 08 ? total number of long frames received with a good crc that were either: 1) longer than 1518 octets (excluding framing bits, but including crc octets), and were otherwise well-formed (good crc), vlan and jumbo deasserted 2) vlan frames that were longer than 1522 octets, with jumbo deas- serted 3) jumbo frames that were longer than 9018 octets, with vlan deas- serted 4) jumbo vlan frames that were longer than 9022 octets 3 jabber x ? 09 ? total number of long frames received with bad crc that were either: 1) longer than 1518 octets (excluding framing bits, but including crc octets), and were not well-formed (bad crc), vlan and jumbo not asserted 2) vlan frames that were longer than 1522 octets, with jumbo deas- serted 3) jumbo frames that were longer than 9018 octets, with vlan deas- serted 4) jumbo vlan frames that were longer than 9022 octets 3 1. the states of vlan or jumbo have no effect on this count. 2. excluding framing bits but including crc octets 3. reception of frames meeting the criteria for this counter are aborted by the hardware. abort is indicated in the ingress port control block and in the ingress frame control block. if the frame has not been forwarded by the picocode, the picocode must enqueue the frame to the ingress discard queue. if the frame has been forwarded, then the egress hardware will discard the frame. further reception at the mac is inhibited until the next frame starts.
ibm powernp np4gs3 preliminary network processor np3_dl_sec03_pmm.fm.08 may 18, 2001 physical mac multiplexer page 115 of 554 frames with bad crc x ? 0a ? 3 total number of frames received that were not long frames, but had bad crc 2 unicast frames received x ? 0b ? total number of good frames received that were directed to the unicast address (excluding multicast frames, broadcast frames, or long frames) broadcast frames received x ? 0c ? total number of good frames received that were directed to the broadcast address (excluding multicast frames or long frames) multicast frames received x ? 0d ? total number of good frames received that were directed to the multicast address (excluding broadcast frames, or long frames) total frames received x ? 0e ? 2 total number of frames (including bad frames, broadcast frames, multicast frames, unicast frames, and long frames) received receive errors x ? 0f ? total number of frames received in which the phy detected an error and asserted the rx_err signal overruns x ? 13 ? total number of frames received when the pmm internal buffer was full and the frame couldn ? t be stored. includes frames in the process of being received when the pmm internal buffer becomes full. pause frames x ? 10 ? total number of mac pause frames received that were well-formed and had a good crc total pause time x ? 11 ? 1 total amount of time spent in a pause condition as a result of receiving a good pause mac frame total octets received x ? 12 ? 0 total number of octets of data (including those in bad frames) received on the network 2 table 3-1. ingress ethernet counters (page 2 of 2) name / group counter no. group description notes 1. the states of vlan or jumbo have no effect on this count. 2. excluding framing bits but including crc octets 3. reception of frames meeting the criteria for this counter are aborted by the hardware. abort is indicated in the ingress port control block and in the ingress frame control block. if the frame has not been forwarded by the picocode, the picocode must enqueue the frame to the ingress discard queue. if the frame has been forwarded, then the egress hardware will discard the frame. further reception at the mac is inhibited until the next frame starts.
ibm powernp np4gs3 network processor preliminary physical mac multiplexer page 116 of 554 np3_dl_sec03_pmm.fm.08 may 18, 2001 table 3-2. egress ethernet counters (page 1 of 2) name counter no. group description cnt short frames x ? 00 ? 4 total number of frames transmitted with less than 64 octets that had good crc. 1, 2 runt frames (bad crc) x ? 01 ? total number of frames transmitted with less than 64 octets that had bad crc. 1, 2 frames transmitted (64 octets) x ? 02 ? total number of frames (including bad frames) transmitted that were 64 octets in length. 1 frames transmitted (65to127octets) x ? 03 ? total number of frames (including bad frames) transmitted that were between 65 and 127 octets in length inclusive. 1, 3 frames transmitted (128 to 255 octets) x ? 04 ? total number of frames (including bad frames) transmitted that were between 128 and 255 octets in length inclusive 1, 3 frames transmitted (256 to 511 octets) x ? 05 ? total number of frames (including bad frames) transmitted that were between 256 and 511 octets in length inclusive. 1, 3 frames transmitted (512 to 1023 octets) x ? 06 ? total number of frames (including bad frames) transmitted that were between 512 and 1023 octets in length inclusive. 1, 3 frames transmitted (1024 to 1518 octets) x ? 07 ? total number of frames (including bad frames) transmitted that were between 1024 and 1518 octets in length inclusive. 1, 3 jumbo frames x ? 16 ? total number of frames (including bad frames) transmitted with a length between 1519 and 9018 octets, or between 1523 and 9022 if vlan is asserted. if jumbo is not asserted, the frame is a long frame. long frames transmitted x ? 08 ? total number of frames with good crc transmitted that were either: 1) longer than 1518 octets (excluding framing bits, but including crc octets), and were otherwise well-formed (good crc), vlan and jumbo are deasserted 2) vlan frames that were longer than 1522 octets with jumbo deasserted 3) jumbo frames that were longer than 9018 octets with jumbo deasserted 4) jumbo vlan frames that were longer than 9022 octets jabber x ? 09 ? total number of frames with bad crc transmitted that were either: 1) longer than 1518 octets (excluding framing bits, but including crc octets), and were otherwise well-formed (good crc), vlan and jumbo are deasserted 2) vlan frames that were longer than 1522 octets with jumbo deasserted 3) jumbo frames that were longer than 9018 octets with jumbo deasserted 4) jumbo vlan frames that were longer than 9022 octets late collisions x ? 0a ? total number of frames transmitted that experienced a network collision after 64 bytes of the frame had been transmitted. 1 1. the states of vlan or jumbo have no effect on this count. 2. including the frame type byte. 3. excluding framing bits but including crc octets.
ibm powernp np4gs3 preliminary network processor np3_dl_sec03_pmm.fm.08 may 18, 2001 physical mac multiplexer page 117 of 554 total collisions x ? 0b ? 3 best estimate of the total number of collisions on this ethernet segment 1 single collisions x ? 0c ? total number of frames transmitted that experienced one collision before 64 bytes of the frame were transmitted on the network 1 multiple collisions x ? 0d ? total number of frames transmitted that experienced more than one col- lision before 64 bytes of the frame were transmitted on the network 1 excessive deferrals x ? 0e ? total number of frames whose transmission could not be started before the deferral time out expired. the deferral time out value is 24,416 media bit times. 1 underruns x ? 0f ? total number of frames that were not completely transmitted because the data could not be obtained from the egress eds fast enough to maintain the media data rate aborted frames x ? 17 ? total number of frames that had either link status pointer parity errors, or had a header qsb field point beyond eof, and were aborted by the egress pmm crc error x ? 10 ? 2 total number of frames transmitted that had a legal length (excluding framing bit, but including crc octets) of between 64 and 9018 octets, (64 and 9022 for vlan frames) inclusive, but had a bad crc excessive collisions x ? 11 ? total number of frames that experienced more than 16 collisions during transmit attempts. these frames are dropped and not transmitted unicast frames transmitted x ? 12 ? total number of good frames transmitted that were directed to the uni- cast address (not including multicast frames or broadcast frames) broadcast frames transmitted x ? 13 ? total number of good frames transmitted that were directed to the broadcast address (not including multicast frames) multicast frames transmitted x ? 14 ? total number of good frames transmitted that were directed to the multi- cast address (not including broadcast frames) total octets transmitted x ? 15 ? 0 total number of octets of data (including those in bad frames) transmit- ted on the network 3 table 3-2. egress ethernet counters (page 2 of 2) name counter no. group description cnt 1. the states of vlan or jumbo have no effect on this count. 2. including the frame type byte. 3. excluding framing bits but including crc octets.
ibm powernp np4gs3 network processor preliminary physical mac multiplexer page 118 of 554 np3_dl_sec03_pmm.fm.08 may 18, 2001 3.1.3 ethernet support table 3-3 lists the features and standards supported by a dmu in the different ethernet modes. table 3-3. ethernet support (page 1 of 2) feature ethernet mode fast (smii) gigabit (gmii) gigabit (tbi) fully compatible with ieee standard 802.3: 1993 (e) x 802.3u / d5.3 x 802.3z / d4 standards x 802.3 clause 36 (tbi) standards x compliant with rfc 1757 management registers and counters (tx/rx counters) additional receive counters for total number of pause frames and total aggregate pause time additional transmit counters for number of single collisions and number of multiple colli- sions xx supports the ieee standards on flow control by honoring received pause frames and inhibiting frame transmission while maintaining pause counter statistics xx supports smii to the phy interfaces with up to ten phys that support the smii interface. each of the ten interfaces can have a bit rate of either 10 mbps or 100 mbps. x capable of handling ten ports of 10 mbps or 100 mbps media speeds, any speed mix x supports half-duplex operations at media speed on all ports x supports binary exponential back-off (beb) compliant with the ieee std. 802.3: 1993 (e) x supports full duplex point-to-point operations at media speed x x detects vlan (8100 frame type) ethernet frames and accounts for them when calculating long frames xx supports two ethernet frame types (programmable) and, based on these, detects received frames with a type field that matches one of those types. a match instructs the multi-port 10/100 mac to strip the da, sa, and type fields from the received frame and to instruct the higher device func- tions to queue the frame in a different queue. an example of a special function is to identify ethernet encapsulated guided frames. a mismatch results in normal frame queueing and normal higher device processing. xx programmable cyclic redundancy check (crc) insertion on a frame basis with crc insertion disabled, mac transmits frames as is (suitable for switch environments) with crc insertion enabled, the mac calculates and inserts the crc xx includes jumbo frame support. when configured, can transmit and receive up to 9018-byte non- vlan frames or up to 9022-byte vlan frames. transfers received data to the upper device layers using a proprietary 16-byte wide interface. in the transmit direction, data from the data mover is sent to the mac on a single 8-bit bus. xx supports the gigabit medium independent interface (gmii) to the physical tbi layer x supports the ibm tbi ? valid byte ? signal x can be combined with the tbi to form a complete tbi solution x interfaces with any pma/pmi physical layer using the pma service interface defined in the ieee 802.3 standard x synchronizes the data received from the pma (two phase) clock with the mac (single phase) clock. provides a signal to the mac indicating those clock cycles that contain new data. x checks the received code groups (10 bits) for commas and establishes word synchronization x
ibm powernp np4gs3 preliminary network processor np3_dl_sec03_pmm.fm.08 may 18, 2001 physical mac multiplexer page 119 of 554 calculates and checks the tbi running disparity x supports auto-negotiation including two next pages x interfaces with the 1000 mbps ethernet mac through the gmii to form a complete tbi solution x table 3-3. ethernet support (page 2 of 2) feature ethernet mode fast (smii) gigabit (gmii) gigabit (tbi)
3.2 pos overview

a dmu configured in packet over sonet (pos) mode can support both clear-channel and channelized optical carrier (ocs) interfaces. a dmu can support the following types and speeds of pos framers: oc-3c, oc-12, oc-12c, oc-48, and oc-48c. each dmu allows connection to oc-3c, oc-12, or oc-12c framers. to provide an oc-48 link, all four dmus must be attached to a single framer with each dmu providing four oc-3c channels or one oc-12c channel to the framer. to provide an oc-48 clear channel (oc-48c) link, dmu a must be configured to attach to a 32-bit framer and the other three dmus, although configured for oc-48 mode, are disabled except for their interface pins.

table 3-4 shows the three dmu configuration modes: oc-3c, oc-12c, and oc-48c. when configured for oc-3c, the dmu assumes that it must poll the four possible attached ports before transferring data. if the dmu is configured for either oc-12c or oc-48c, the dmu does not poll the framer.

figure 3-7 shows a configuration in which the pmm has oc-3c, oc-12, and oc-12c connections operating simultaneously. each of the four dmus has been configured to operate as a single port or as four ports. each of the four dmus supports an 8-bit data interface in both the ingress and egress directions.

figure 3-8 shows an oc-48 configuration. each dmu is configured to operate in a single port mode and to produce either a single oc-12c channel or four oc-3c channels. each dmu supports an 8-bit data interface in both the ingress and egress directions.

table 3-4. dmu and framer configurations

dmu configuration mode    networks supported via pos framer
oc-3c (8-bit mode)        4 x oc-3c per dmu, or 1 x oc-12c per dmu, or 1 x oc-48c (all 4 dmus required)
oc-12c (8-bit mode)       1 x oc-12c per dmu, or 1 x oc-48c (all 4 dmus required)
oc-48c (32-bit mode)      1 x oc-48c (dmu a in 32-bit mode)

figure 3-7. oc-3c / oc-12 / oc-12c configuration (diagram: four framers attached to dmus a through d; two dmus in oc-3c mode each carry 4 x oc-3c connections on four ports, one dmu in oc-3c mode carries a channelized oc-12 connection as four logical channels, and one dmu in oc-12c mode carries a single oc-12c connection on one port)
figure 3-9 shows an oc-48c configuration. dmu a is configured to operate in a single-port 32-bit mode to support oc-48c. dmus b, c, and d must be configured for oc-48 mode and their data ports routed to dmu a. except for the i/os, dmus b, c, and d are not used in this configuration. the attached framer must also be configured for a 32-bit data interface.

figure 3-8. oc-48 configuration (diagram: a single framer with an oc-48 connection attached to all four dmus; each dmu operates as one port in oc-3c or oc-12c mode and contributes either four oc-3c channels or one oc-12c channel)

figure 3-9. oc-48c configuration (diagram: a single framer with an oc-48c connection attached to dmu a, which is in oc-48c mode with one 32-bit port; dmus b, c, and d are in oc-48 mode with their data ports routed to dmu a)
3.2.1 pos timing diagrams

the following figures show timing diagrams for the different pos modes:

- oc-3c
  - figure 3-10: receive pos8 interface timing for 8-bit data bus (oc-3c) on page 122
  - figure 3-11: transmit pos8 interface timing for 8-bit data bus (oc-3c) on page 123
- oc-12c
  - figure 3-12: receive pos8 interface timing for 8-bit data bus (oc-12c) on page 124
  - figure 3-13: transmit pos8 interface timing for 8-bit data bus (oc-12c) on page 125
- oc-48c
  - figure 3-14: receive pos32 interface timing for 32-bit data bus (oc-48c) on page 126
  - figure 3-15: transmit pos32 interface timing for 32-bit data bus (oc-48c) on page 127

figure 3-10. receive pos8 interface timing for 8-bit data bus (oc-3c) (waveforms for rxclk, rxeof, rxval, rxdata, rxenb, rxaddr, and rxpfa over clock cycles 0-21)

1. data is being received from port #0 while the dmu is polling.
2. in clock cycle #3 the framer asserts rxpfa indicating data is available. in this example the dmu is configured for two cycles of framer latency (bus_delay = 1), therefore it is port #2 responding with rxpfa (clock cycle #1).
3. in clock cycle #6 the dmu deasserts rxenb (negative active) to indicate it is starting the port selection process.
4. in clock cycle #8 the dmu puts p2 on rxaddr.
5. in clock cycle #9 the dmu selects port p2 by asserting rxenb. the port selected is the port whose address is on rxaddr the cycle before rxenb is asserted.
6. in clock cycle #16 the framer deasserts rxval (assumed that the framer was not ready to continue data transfer) and restarts data transfer to the dmu during clock cycle #17.
7. in clock cycle #18 the framer asserts rxeof marking the last transfer for the frame.
8. in clock cycle #19 the framer deasserts both rxeof and rxval completing the frame transfer.
9. in clock cycles #16 and #17 rxpfa is asserted indicating data is available on ports 0 and 1.
10. in clock cycle #20 the dmu starts the selection of the next port.
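the polling handshake in figure 3-10 can also be summarized in software form. the following c model is a simplified, illustrative sketch of the receive-side steps in the notes above (poll the ports, select the port whose address was on rxaddr the cycle before rxenb asserts, then accept data while rxval is asserted); the signal accessors and function names are assumptions, and the real dmu performs this in hardware with the configured bus_delay latency.

```c
/* simplified model of the oc-3c receive polling handshake of figure 3-10.
 * the accessors below are illustrative stand-ins for hardware signals;
 * only the protocol steps come from the figure notes. */
#include <stdbool.h>
#include <stdint.h>

extern bool rx_pfa(void);                 /* framer: data available for polled port */
extern bool rx_val(void);                 /* framer: data on rxdata is valid         */
extern bool rx_eof(void);                 /* framer: last transfer of the frame      */
extern uint8_t rx_data(void);             /* framer: 8-bit data bus                  */
extern void drive_rx_addr(unsigned port); /* dmu: port address being polled          */
extern void drive_rx_enb(bool active);    /* dmu: port select (negative active pin)  */
extern void wait_clock(void);             /* advance one rxclk cycle                 */

/* poll the four oc-3c ports, select the first that reports data, and move
 * one frame into an ingress buffer. */
static void receive_one_frame(uint8_t *buf, unsigned *len)
{
    unsigned port = 0;

    for (;;) {                             /* polling loop */
        drive_rx_addr(port);
        wait_clock();
        if (rx_pfa())                      /* this port has data available */
            break;
        port = (port + 1) & 3;
    }

    drive_rx_enb(true);                    /* select the port whose address was
                                              on rxaddr the previous cycle */
    *len = 0;
    do {
        wait_clock();
        if (rx_val())                      /* framer may pause by deasserting rxval */
            buf[(*len)++] = rx_data();
    } while (!rx_eof());                   /* rxeof marks the last transfer */

    drive_rx_enb(false);                   /* end of frame: restart port selection */
}
```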
figure 3-11. transmit pos8 interface timing for 8-bit data bus (oc-3c) (waveforms for txclk, txsof, txeof, txdata, txenb, txaddr, and txpfa over clock cycles 0-21)

1. in this figure it is assumed that the framer has a 2-cycle delay before responding to the dmu and that the dmu has bus_delay = 1.
2. in clock cycle #3 the framer responds to the poll by asserting txpfa; this indicates that the fifos for ports 2 and 3 have room for data (2 clock cycle delay).
3. in clock cycle #6 the dmu starts transferring a new frame to port 2 by asserting txsof, asserting txenb (negative active), and putting data on txdata.
4. in clock cycle #12 the dmu asserts txeof indicating the last transfer of the frame.
5. in clock cycle #13 the dmu deasserts txenb to indicate that the dmu is in the port selection process.
6. in clock cycle #16 the framer asserts txpfa indicating it has data for port 3.
7. in clock cycle #19 the dmu asserts txenb and selects port 3.
8. in clock cycle #19 the dmu asserts txsof indicating the start of a new frame for port 3 and places data on txdata.
figure 3-12. receive pos8 interface timing for 8-bit data bus (oc-12c) (waveforms for rxclk, rxeof, rxval, rxdata, rxenb, rxaddr, and rxpfa over clock cycles 0-21)

1. in clock cycle #1 the framer asserts rxpfa indicating it has data to transfer to the dmu (for oc-12c operation the framer can tie rxpfa to a logical 1 and control data transfers with rxval).
2. in clock cycle #2 the dmu asserts rxenb (negative active) selecting the port whose address was on rxaddr during clock cycle #1 (for oc-12c operation rxaddr must be provided to the dmu from the card).
3. in clock cycle #4 the framer asserts rxval and starts the data transfer to the dmu. data transfer continues as long as rxval is asserted. the framer must deassert rxval at the end of the frame, clock cycle #10, but can deassert and then reassert anytime within the frame.
4. in clock cycle #10 the dmu deasserts rxenb because rxeof was asserted at clock cycle #9.
5. in clock cycle #11 the dmu asserts rxenb because the framer has rxpfa asserted.
6. in clock cycle #15 the framer starts the transfer of the next frame by asserting rxval. the dmu accepts data as early as clock cycle #12 and will wait (no limit) until the framer asserts rxval before accepting data.
7. the framer can control data transfer using rxval and the dmu can control data transfer using rxenb. as noted above, rxpfa can remain asserted.

note: for oc-12c, rxaddr is not provided by the dmu. rxaddr must be provided by board logic. there is a two-cycle delay between the assertion of rxenb and rxval; the dmu can accept a one-cycle delay.
figure 3-13. transmit pos8 interface timing for 8-bit data bus (oc-12c) (waveforms for txclk, txsof, txeof, txdata, txenb, txaddr, and txpfa over clock cycles 0-21)

1. in clock cycle #0 the framer has txpfa asserted indicating it can accept data from the dmu.
2. in clock cycle #6 the dmu selects port 0 by asserting txenb (negative active). the dmu also asserts txsof indicating the beginning of the frame and places data on txdata.
3. in clock cycle #8 the framer indicates it cannot accept more data by deasserting txpfa. the dmu requires 6 clock cycles to stop data transfer, hence data transfer stops after clock cycle #13 by deasserting txenb.
4. in clock cycle #15 the framer asserts txpfa indicating that it can accept more data.
5. in clock cycle #19 the dmu asserts txenb indicating data is valid on txdata.
6. in clock cycle #20 the dmu asserts txeof indicating the last transfer of the frame.
7. in clock cycle #21 the dmu deasserts txeof and txenb.

note: for oc-12c, txaddr is not provided by the dmu. txaddr must be provided by board logic.
figure 3-14. receive pos32 interface timing for 32-bit data bus (oc-48c) (waveforms for rxclk, rxeof, rxval, rxdata, rxenb, rxpadl, rxaddr, and rxpfa over clock cycles 0-20)

1. in clock cycle #0 the dmu is receiving data from the framer.
2. in clock cycle #1 the framer asserts rxeof indicating the last data transfer of the current frame.
3. in clock cycle #2 the framer deasserts rxeof and rxval marking the end of the frame.
4. in clock cycle #2 the dmu deasserts rxenb starting the port selection process.
5. in clock cycle #3 the dmu checks rxpfa (asserted) and asserts rxenb indicating it can accept more data.
6. in clock cycle #5 the data transfer from the framer to the dmu is started. the data transfer continues until clock cycle #15. the framer halts the transfer during cycles 9 and 10 by deasserting rxval.
7. in clock cycle #14 the framer asserts rxeof marking the end of the frame.
8. in clock cycle #15 the dmu deasserts rxenb. the dmu asserts rxenb in clock cycle #18 after detecting rxpfa asserted, indicating that the framer has data to transfer. deassertion of rxenb in clock cycle #18 selects port 0 again and data transfer begins in clock cycle #20.

note: for oc-48c, rxaddr is not provided by the dmu. rxaddr must be provided by board logic. this figure shows a two clock cycle delay between deassertion of rxenb and the assertion of rxval. the dmu can accept a one cycle delay.
figure 3-15. transmit pos32 interface timing for 32-bit data bus (oc-48c) (waveforms for txclk, txsof, txeof, txdata, txenb, txpadl, txaddr, and txpfa over clock cycles 0-20)

1. in clock cycle #2 the framer deasserts txpfa, indicating it cannot accept more data from the dmu.
2. in clock cycle #9 the dmu stops the data transfer in response to the deassertion of txpfa in clock cycle #2.
3. in clock cycle #9 the dmu deasserts txenb and begins its selection sequence.
4. in clock cycle #5 the framer asserts txpfa indicating it can again accept data. seven clock cycles later the dmu will assert txenb (clock cycle #12) and place the next data word on txdata.
5. in clock cycle #14 the dmu asserts txeof to indicate the end of the frame. the dmu also indicates the number of pad bytes in the last data transfer (in this example 3 data bytes were set in the last data transfer).
6. in clock cycle #15 the dmu deasserts txeof and deasserts txenb indicating the end of the frame.
7. in clock cycle #17 the framer deasserts txpfa indicating that it cannot accept more data.

note: for oc-48c, txaddr is not provided by the dmu. txaddr must be provided by board logic.
ibm powernp np4gs3 network processor preliminary physical mac multiplexer page 128 of 554 np3_dl_sec03_pmm.fm.08 may 18, 2001 3.2.2 pos counters table 3-5 on page 128 and table 3-6 on page 129 provide information about the counters maintained to support pos interfaces. many of the counters defined in these tables deal with long frames. a long frame is a packet whose byte count exceeds the value held in the packet over sonet maximum frame size (pos_max_fs) register (see configuration on page 437). the byte count includes the cyclic redundancy check (crc) octets. reception of packets meeting the criteria for long frames is aborted by the hardware. abort is indicated in the ingress port control block and in the ingress frame control block. if the packet has not been forwarded by the picocode, the picocode must enqueue the packet to the ingress discard queue. if the packet has been forwarded, then the egress hardware will discard the frame. further reception at the mac is inhibited until the next packet starts. table 3-5. receive counter ram addresses for ingress pos mac (page 1 of 2) port name counter number description port 0 long frames received x ? 00 ? total number of frames received that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 01 ? total number of frames received that had a length of the value contained in the pos maximum frame size register (pos_max_fs) or less, but had a bad crc total good frames received x ? 02 ? total number of frames (excluding frames with bad crc, and long frames) received receive errors x ? 03 ? total number of frames received in which the framer detected an error and asserted the rx_err signal total octets received (including frames with errors) x ? 10 ? total number of octets of data (including those in bad frames) received on the network port 1 long frames received x ? 04 ? total number of frames received that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 05 ? total number of frames received that had a length of the value contained in the pos maximum frame size register (pos_max_fs) or less, but had a bad crc total good frames received x ? 06 ? total number of frames (excluding frames with a bad crc, and long frames) received receive errors x ? 07 ? total number of frames received in which the framer detected an error and asserted the rx_err signal total octets received (including frames with errors) x ? 11 ? total number of octets of data (including those in bad frames) received on the network
ibm powernp np4gs3 preliminary network processor np3_dl_sec03_pmm.fm.08 may 18, 2001 physical mac multiplexer page 129 of 554 port 2 long frames received x ? 08 ? total number of frames received that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 09 ? total number of frames received that had a length of the value contained in the pos maximum frame size register (pos_max_fs) or less, but had a bad crc total good frames received x ? 0a ? total number of frames (excluding frames with a bad crc, and long frames) received receive errors x ? 0b ? total number of frames received in which the framer detected an error and asserted the rxerr signal total octets received (including frames with errors) x ? 12 ? total number of octets of data (including those in bad frames) received on the network port 3 long frames received x ? 0c ? total number of frames received that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 0d ? total number of frames received that had a length of the value contained in the pos maximum frame size register (pos_max_fs) or less, but had a bad crc total good frames received x ? 0e ? total number of frames (excluding frames with a bad crc, and long frames) received receive errors x ? 0f ? total number of frames received in which the framer detected an error and asserted the rx_err signal total octets received (including frames with errors) x ? 13 ? total number of octets of data (including those in bad frames) received on the network table 3-6. transmit counter ram addresses for egress pos mac (page 1 of 2) port name counter number description port 0 long frames transmitted x ? 00 ? total number of frames transmitted that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 01 ? total number of frames transmitted that had a length of the value contained in the pos maximum frame register (pos_max_fs) or less, but had bad crc total good frames transmitted x ? 02 ? total number of frames (excluding frames with a bad crc, and long frames) transmitted transmit underruns x ? 03 ? total number of frames attempted to be transmitted but an under- runoccurredinthenp4gs3 total octets transmitted (including frames with errors) x ? 10 ? total number of octets of data (including those in bad frames) transmitted on the network aborted frames x ? 14 ? total number of frames that had either link status pointer parity errors, or had a header qsb field point beyond eof, and were aborted by the egress pmm table 3-5. receive counter ram addresses for ingress pos mac (page 2 of 2) port name counter number description
ibm powernp np4gs3 network processor preliminary physical mac multiplexer page 130 of 554 np3_dl_sec03_pmm.fm.08 may 18, 2001 port 1 long frames transmitted x ? 04 ? total number of frames transmitted that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 05 ? total number of frames transmitted that had a length of the value contained in the pos maximum frame register (pos_max_fs) or less, but had bad crc total good frames transmitted x ? 06 ? total number of frames (excluding frames with a bad crc, and long frames) transmitted transmit underruns x ? 07 ? total number of frames attempted to be transmitted, but an under- runoccurredinthenp4gs3 total octets transmitted (including frames with errors) x ? 11 ? total number of octets of data (including those in bad frames) transmitted on the network aborted frames x ? 15 ? total number of frames that had either link status pointer parity errors, or had a header qsb field point beyond eof, and were aborted by the egress pmm port 2 long frames transmitted x ? 08 ? total number of frames transmitted that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 09 ? total number of frames transmitted that had a length of the value contained in the pos maximum frame register (pos_max_fs) or less, but had bad crc total good frames transmitted x ? 0a ? total number of frames (excluding frames with a bad crc, and long frames) transmitted transmit underruns x ? 0b ? total number of frames attempted to be transmitted, but an under- runoccurredinthenp4gs3 total octets transmitted (including frames with errors) x ? 12 ? total number of octets of data (including those in bad frames) transmitted on the network aborted frames x ? 16 ? total number of frames that had either link status pointer parity errors, or had a header qsb field point beyond eof, and were aborted by the egress pmm port 3 long frames transmitted x ? 0c ? total number of frames transmitted that were longer than the value contained in the pos maximum frame size register (pos_max_fs) including crc octets frames with bad crc x ? 0d ? total number of frames transmitted that had a length of the value contained in the pos maximum frame register (pos_max_fs) or less, but had bad crc total good frames transmitted x ? 0e ? total number of frames (excluding frames with a bad crc, and long frames) transmitted transmit underruns x ? 0f ? total number of frames attempted to be transmitted but an under- runoccurredinthenp4gs3 total octets transmitted (including frames with errors) x ? 13 ? total number of octets of data (including those in bad frames) transmitted on the network aborted frames x ? 17 ? total number of frames that had either link status pointer parity errors, or had a header qsb field point beyond eof, and were aborted by the egress pmm table 3-6. transmit counter ram addresses for egress pos mac (page 2 of 2) port name counter number description
3.2.3 pos support

table 3-7 lists the features and standards supported by a dmu in the different pos modes.

table 3-7. pos support

oc-3c and oc-12c (8-bit) mode:
- supports quad-port oc-3c connections, quad-port oc-12 connections, or single-port oc-12c connections
- compatible with flexbus 3 quad 8-bit bus operational mode

oc-48c (32-bit) mode:
- supports a single-port oc-48c connection
- compatible with flexbus 3 single 32-bit bus operational mode

both modes:
- programmable crc insertion on a frame basis. with crc insertion disabled, the mac transmits frames as is (suitable for switch environments). with crc insertion enabled, the mac calculates and inserts the crc.
- the minimum frame length the np4gs3 can handle at a sustainable rate is 18 bytes. smaller frames can be handled, but will consume the bandwidth of an 18-byte frame. an indirect back-pressure between the processor and the framer prevents frames from being lost, unless many small frames are received back-to-back.
- provides the following nine tx counters for testing and debugging purposes: number of bytes transmitted / received, number of frames transmitted / received, number of long frames transmitted / received, number of frames with bad crc transmitted / received, and number of receive errors.
4. ingress enqueuer / dequeuer / scheduler

4.1 overview

the ingress enqueuer / dequeuer / scheduler (ingress eds) interfaces with the physical mac multiplexer (pmm), the embedded processor complex (epc), and the ingress switch interface (ingress swi).

frames that have been received on the data mover unit (dmu) interface are passed through the pmm to the ingress eds. the ingress eds collects the frame data in its internal data store; when it has received sufficient data, the ingress eds enqueues the frame to the epc for processing. the ingress eds does not need to receive the entire frame before it enqueues the data (that is, it can operate in cut-through mode).

once the epc processes the frame, it provides forwarding and quality of service (qos) information to the ingress eds. the ingress eds then invokes hardware-configured flow control mechanisms and either discards the frame or places it into a queue to await transmission.

the ingress eds schedules all frames that cross the ingress swi. after it selects a frame, the ingress eds passes the frame data to the ingress swi, which segments the frame into data cells and sends them on to the data-aligned synchronous link (dasl) interface (which, in turn, passes the data to the switch fabric).
4.2 operation

figure 4-1 illustrates the operations of the ingress eds discussed in this section.

the frame demultiplexer (fd) receives incoming frames from the pmm. an fcb is allocated on-the-fly from the fcb free queue as each new frame starts. the fd writes each frame as a list of chained data buffers into its internal ingress data store. the buffer addresses are dequeued on-the-fly from the bcb free queue, which is the linked list of all currently available data buffer addresses.

the epc can process the frames under one of the following conditions:

- cut-through mode is disabled and the frame has been completely received.
- cut-through mode is enabled and sufficient data has been received. the quantity of sufficient data in cut-through mode is programmable in increments of 64 bytes. the first 64 bytes of a frame, in most cases, contain the information to be used in layer 2, layer 3, and layer 4 forwarding.

when a frame is ready for processing by the epc, the ingress eds enqueues its fcb to an input queue: either the gc queue (for control frames) or the gd queue (for data frames).

figure 4-1. logical organization of ingress eds (diagram: the frame demultiplexer receives frames from the pmm using the fcb and bcb free queues; the gc and gd queues feed the epc; frames returned by the epc enter the sof rings, whose round-robin schedulers feed the 64 unicast tb-run queues plus the multicast and discard tb-run queues in high- and low-priority planes; the output and mc/discard schedulers move data to the switch fabric)
the epc is responsible for recognizing that a frame is ready to be processed, dequeueing and processing it, and giving it back to the ingress eds (by enqueueing to a tdmu queue) for forwarding to the ingress swi. the epc returns each frame in an ingress enqueue operation to a specific target data mover unit (tdmu) queue (as determined by the routing table lookup). at this point in the flow, the ingress eds flow control logic determines if the frame should be enqueued into the ingress scheduler (that is, added to the tdmu queue) or if it should be discarded. see section 4.3 ingress flow control on page 138 for details.

tdmu queues are linked in start-of-frame (sof) rings. an sof ring can hold up to four tdmu queues. to maintain optimal ring structure, only non-empty tdmu queues are included in an sof ring. each sof ring is associated with a particular tb-run queue, creating an sof ring / tb-run queue set that provides a fair initiation of frame transmission to all dmus of a target network processor. fairness is achieved by a round-robin mechanism that successively selects each tdmu queue in the sof ring. when a tdmu queue is selected, the frame currently at its head is dequeued from the tdmu queue and enqueued in a tb-run queue (the actual entry point of data movement to the ingress swi) once sufficient data for forwarding has been received.

for each target blade (tb) (the frame's destination across the switch), there are two sof ring / tb-run queue sets: one for unicast high priority traffic and one for unicast low priority traffic. four other sets are reserved for multicast high and low priority traffic, and for high and low priority discard traffic. all six sets are scheduled for service at each switch cell time. for each priority plane, round-robin scheduling is performed between the 64 unicast tb-run queues and the multicast and discard traffic. a second level of scheduling alternates between the selected unicast candidate and the candidate from the multicast and discard tb-run queues. these schedulers are controlled by the status (empty or non-empty) of each tb-run queue, and by the flow control indication received from the switch interface (swi). congestion information from the swi is comprised of three 'grants':

- master grant (master_grant_a/b): loss of master grant of a priority eliminates all queues of the indicated priority from the round-robin selection, with the exception of the discard queues.
- multicast grant (mc_grant_a/b): loss of multicast grant of a priority eliminates all multicast queues of the indicated priority from the round-robin selection.
- output queue grant (oqg): loss of oqg for a target blade and priority eliminates that target blade and priority queue from the round-robin selection (see section 5.4.2 output queue grant reporting on page 153).

absolute priority is enforced: the low priority candidate is used only if there is no high priority candidate present at the time.

the cut-through function is supported by queueing partial frames that contain enough data to fill a switch cell. if, after scheduling to the switch, enough data remains in the frame to fill another switch cell, then the frame is kept in the tb-run queue. if all the frame data has not been received from the pmm, but there is insufficient data in the data store to fill the next switch cell, the fcb is removed from the tb-run queue and left 'floating' (not part of any queue). when the frame demultiplexer detects that there is enough data to fill a switch cell, or the last byte of the frame is received, it re-enqueues the frame in its tb-run queue. if all the frame data has been received, the fcb stays in the tb-run queue until all the data has been moved to the swi. at that point, the fcb is sent to the fcb free queue.
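the two-level selection just described (a per-priority round robin that produces a unicast candidate and a multicast/discard candidate, then strict priority between the high and low planes) can be modeled in software. the following c sketch is illustrative only: the data structures, state variables, and function names are assumptions, and the real selection is performed in hardware once per switch cell time using the grant signals.

```c
/* illustrative model of the per-cell-time tb-run queue selection described
 * above. names and structure are assumptions for illustration. */
#include <stdbool.h>

#define NUM_TB 64
enum { SEL_NONE = -1, SEL_MULTICAST = 64, SEL_DISCARD = 65 };

struct priority_plane {
    bool unicast_nonempty[NUM_TB];  /* unicast tb-run queue status          */
    bool oqg[NUM_TB];               /* output queue grant per target blade  */
    bool mc_nonempty, discard_nonempty;
    bool master_grant, mc_grant;    /* grants received from the swi         */
    unsigned uc_rr;                 /* unicast round-robin position         */
    bool serve_mc_next;             /* second-level alternation state       */
};

static int select_in_plane(struct priority_plane *p)
{
    int uc = SEL_NONE, mcd = SEL_NONE;

    if (p->master_grant) {                         /* discard is exempt     */
        for (unsigned i = 0; i < NUM_TB; i++) {    /* unicast round robin   */
            unsigned tb = (p->uc_rr + i) % NUM_TB;
            if (p->unicast_nonempty[tb] && p->oqg[tb]) {
                uc = (int)tb;
                p->uc_rr = (tb + 1) % NUM_TB;
                break;
            }
        }
        if (p->mc_nonempty && p->mc_grant)
            mcd = SEL_MULTICAST;
    }
    if (mcd == SEL_NONE && p->discard_nonempty)
        mcd = SEL_DISCARD;

    if (uc != SEL_NONE && mcd != SEL_NONE) {       /* second level alternates */
        p->serve_mc_next = !p->serve_mc_next;
        return p->serve_mc_next ? mcd : uc;
    }
    return (uc != SEL_NONE) ? uc : mcd;
}

/* absolute priority: the low plane is served only when the high plane has
 * no candidate this cell time. */
static int select_cell(struct priority_plane *hi, struct priority_plane *lo)
{
    int sel = select_in_plane(hi);
    return (sel != SEL_NONE) ? sel : select_in_plane(lo);
}
```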
as can be seen from the above description, the i-eds may have multiple frames "in flight". the number of frames that may be simultaneously in a state of transmission is controlled by the configuration of the tb mode register (tb_mode) (see section 13.3 tb mode register (tb_mode) on page 446). this value controls the number of correlators available (see table 5-1: cell header fields on page 145). the i-eds maintains correlator free pools for unicast and multicast frames and for each priority (a total of four free pools).

note: for packed frames, two correlators are required, one for the frame completing and one for the frame that is starting (see table 5-1: cell header fields on page 145 and table 5-2: frame header fields on page 148).

4.2.1 operational details

this section provides more details about some of the operations described in section 4.2.

figure 4-2 details the structure of the start-of-frame (sof) rings acting as multiple-port entry points to the target data mover unit (tdmu) queues. an sof ring is a circular linked list of tdmu queues dynamically inserted into and removed from the ring. note that the multicast and discard sof rings contain only a single tdmu queue.

figure 4-2. sof ring structure (diagram: a circular ring of up to four tdmu queues, the maximum for an sof ring, served by a round-robin scheduler)

figure 4-3: ingress eds logical structure on page 137 illustrates the various logical structures of the eds that are discussed in the rest of this section.

a frame (or packet) is stored in the internal ingress data store in the form of a linked list of data buffers. each data buffer is a 64-byte area of the ingress data store. the ingress data store has 2048 data buffers available, each with a unique identifier value (0 through 2047). data buffers are chained by next buffer control block address (nba) pointers located in buffer control blocks (bcbs). one bcb is associated with each data buffer and shares the same identifier value (there are 2048 bcbs). bcbs are physically located in an independent embedded memory.
each frame is represented by a frame control block (fcb) that identifies the new frame by data location (the address of the first buffer in the data store) and control information (the source port from which this frame originated). 2048 fcbs are available in an independent embedded memory. each fcb contains parameters associated with the frame, such as the two pointers shown in figure 4-3:

- the control block address (cba) points to the first data buffer of a frame (for frames whose transmission to the switch has not started), or the data buffer ready to be transmitted (for frames whose transmission to the ingress swi has already started).
- the next fcb address (nfa) points to the fcb of the next frame in the queue.

the fcb also maintains information about the amount of one frame's data currently in the ingress data store. as the data is moved out of the data store across the swi, the fcb accounting information reflects the reduction in data present in the data store. at the same time, more of the frame's data may still be arriving from the pmm, and the accounting information in the fcb will reflect the increase of data in the data store.

frames are chained in queues which consist of fifo-linked lists of fcbs and the queue control blocks (tdmu queues). a tdmu queue contains the information necessary to manage the linked list of fcbs, particularly:

- head pointer: points to the first fcb in the queue
- tail pointer: points to the last fcb in the queue
- count: indicates how many fcbs are present in the queue
- next qcb address (nqa) pointer: points to the next tdmu queue. this pointer enables the chaining of tdmu queues and is the mechanism for creating the sof ring of tdmu queues.

figure 4-3. ingress eds logical structure (diagram: tdmu queue control blocks with head pointer, tail pointer, count, and nqa pointer; the nqa pointers chain tdmu queues into an sof ring; each queue's fcbs are linked by nfa pointers, each fcb's cba points to its frame's first data buffer, and the data buffers are chained by the nba pointers in their bcbs)
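the control structures described above map naturally onto a few small records. the c sketch below is illustrative only: field names, widths, and the enqueue helper are assumptions, not the device's actual layout; only the relationships (bcbs chained by nba, fcbs chained by nfa and pointing at data via cba, tdmu queues with head/tail/count/nqa) come from the text and figure 4-3.

```c
/* illustrative layout of the ingress eds control structures described
 * above. field names and widths are assumptions; only the pointer
 * relationships come from the text and figure 4-3. */
#include <stdint.h>

#define NUM_BUFFERS 2048          /* 64-byte data buffers, one bcb each */
#define NUM_FCBS    2048

struct bcb {                      /* buffer control block                 */
    uint16_t nba;                 /* next buffer control block address    */
};

struct fcb {                      /* frame control block, one per frame   */
    uint16_t cba;                 /* first (or next-to-send) data buffer  */
    uint16_t nfa;                 /* next fcb in the same queue           */
    uint16_t bytes_in_store;      /* accounting: frame data currently in
                                     the ingress data store               */
    uint8_t  source_port;         /* control information                  */
};

struct tdmu_qcb {                 /* tdmu queue control block             */
    uint16_t head;                /* first fcb in the queue               */
    uint16_t tail;                /* last fcb in the queue                */
    uint16_t count;               /* number of fcbs in the queue          */
    uint16_t nqa;                 /* next tdmu queue: chains queues into
                                     a start-of-frame (sof) ring          */
};

/* enqueue an fcb (by id) at the tail of a tdmu queue */
static void tdmu_enqueue(struct tdmu_qcb *q, struct fcb *fcb_table,
                         uint16_t fcb_id)
{
    if (q->count == 0)
        q->head = fcb_id;
    else
        fcb_table[q->tail].nfa = fcb_id;
    q->tail = fcb_id;
    q->count++;
}
```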
ibm powernp np4gs3 network processor preliminary ingress enqueuer / dequeuer / scheduler page 138 of 554 np3_dl_sec04_ieds.fm.08 may 18, 2001 4.3 ingress flow control flow control (whether to forward or discard frames) in the network processor is provided by hardware assist mechanisms and by picocode that implements a selected flow control algorithm. in general, flow control algo- rithms require information about the congestion state of the data flow, the rate at which packets arrive, the current status of the data store, the current status of target blades, and so on. a transmit probability for various flows is an output of these algorithms. the network processor implements flow control in two ways:  flow control is invoked when the frame is enqueued to a start-of-frame (sof) queue. the hardware assist mechanisms use the transmit probability along with tail drop congestion indicators to determine if a forwarding or discard action should be taken during frame enqueue operation. the flow control hardware uses the picocode ? s entries in the ingress transmit probability memory to determine what flow control actions are required.  flow control is invoked when frame data enters the network processor. when the ingress ds is suffi- ciently congested, these flow control actions discard either all frames, all data frames, or new data frames. the thresholds that control the invocation of these actions are bcb_fq threshold for guided traffic, bcb_fq_threshold_0, and bcb_fq_threshold_1 (see table 4-1: flow control hardware facili- ties on page 139 for more information). 4.3.1 flow control hardware facilities the hardware facilities listed in table 4-1 are provided for the picocode's use when implementing a flow control algorithm. the picocode uses the information from these facilities to create entries in the ingress transmit probability memory. the flow control hardware uses these entries when determining what flow control actions are required.
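the buffer-allocation gating implied by the three bcb_fq thresholds named above (and defined in table 4-1 on the next page) can be sketched as follows. the function and parameter names are illustrative; the real thresholds live in cab-accessible registers.

/* Sketch of the ingress buffer-allocation gating described above. */
#include <stdint.h>
#include <stdbool.h>

typedef enum { TRAFFIC_GUIDED, TRAFFIC_USER_CONT, TRAFFIC_USER_NEW } traffic_t;

/* Return true if a new 64-byte buffer may be allocated for this arrival. */
static bool may_allocate(uint32_t bcb_fq,      /* current free-buffer count        */
                         uint32_t th_guided,   /* BCB_FQ threshold for guided traffic */
                         uint32_t th0,         /* BCB_FQ_Threshold_0               */
                         uint32_t th1,         /* BCB_FQ_Threshold_1               */
                         traffic_t kind)
{
    if (bcb_fq < th_guided)
        return false;                     /* no further buffers for incoming data   */
    if (bcb_fq < th0 && kind != TRAFFIC_GUIDED)
        return false;                     /* only guided traffic may still allocate */
    if (bcb_fq < th1 && kind == TRAFFIC_USER_NEW)
        return false;                     /* new user frames are refused            */
    return true;
}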
ibm powernp np4gs3 preliminary network processor np3_dl_sec04_ieds.fm.08 may 18, 2001 ingress enqueuer / dequeuer / scheduler page 139 of 554 table 4-1. flow control hardware facilities name definition access bcb free queue (bcb_fq) control block register provides an instantaneous count of the number of free buffers available in the ingress data store. cab bcb_fq threshold for guided traffic threshold for bcb_fq. when bcb_fq < bcb_fq_gt_th, no further buffers are allocated for incoming data. cab bcb_fq_threshold_0 threshold for bcb_fq. when bcb_fq < bcb_fq_threshold_0, no fur- ther buffers are allocated for incoming user traffic. guided traffic can still allocate new buffers. when this threshold is violated, an interrupt (class 0, bit 0) is signaled. cab bcb_fq_threshold_1 threshold for bcb_fq. when bcb_fq < bcb_fq_threshold_1, no fur- ther buffers are allocated for new incoming user traffic. guided traffic and packets already started can still allocate new buffers. when this threshold is violated, an interrupt (class 0, bit 1) is signaled. cab bcb_fq_threshold_2 threshold for bcb_fq. when bcb_fq < bcb_fq_threshold_2, an inter- rupt (class 0, bit 2) is signaled. cab flow control ingress free queue threshold (fq_p0_th) threshold for bcb_fq used when determining flow control actions against priority 0 traffic. when bcb_fq < fq_p0_th, the flow control hardware discards the frame. cab flow control ingress free queue threshold (fq_p1_th) threshold for bcb_fq used when determining flow control actions against priority 1 traffic. when bcb_fq < fq_p1_th, the flow control hardware discards the frame. cab bcb fq arrival count arrival rate of data into the ingress data store. this counter increments each time there is a dequeue from the bcb free queue. when read by pico- code via the cab, this counter is set to 0 (read with reset). cab ingress free queue count exponentially weighted moving average calculated ewma of the bcb fq (bcb_fq_ewma). cab flow control ingress free queue threshold (fq_sbfq_th) threshold for bcb_fq. when bcb_fq < fq_sbfq_th, the i_freeq_th is set to 1. this may be used by external devices assisting in flow control. cab localtb_mc_status_0 threshold status of the priority 0 tdmu queues and the multicast queue. the count and the threshold are compared four times per target blade (once per dmu). the results are combined into a single result on the target blade, and the corresponding bit is set in localtb_mc_status_0. if tdmu_qcb.qcnt > tdmu_qcb.th, the bit is set to 1; if not, it is set to 0. hardware only tdmu queue control block (dmu_qcb) threshold for number of frames enqueued to the tdmu. these values are used to compare against the queue count when setting localtb_mc_status_0 bits. cab remote tb status 0 this 64-bit register contains the congestion status of all remote target blades ? egress data stores. the congestion status of each remote target blade is the result of a comparison between the configured threshold and the ewma of the offered rate of priority 0 traffic (see table 6-1: flow con- trol hardware facilities on page 168). this information is collected via the res_data i/o. hardware only remote tb status 1 this 64-bit register contains the congestion status of all remote target blades ? egress data stores. the congestion status of each remote target blade is the result of a comparison between the configured threshold and the ewma of the offered rate of priority 1 traffic (see table 6-1: flow con- trol hardware facilities on page 168). 
this information is collected via the res_data i/o. hardware only
4.3.2 hardware function

4.3.2.1 exponentially weighted moving average (ewma)

the hardware generates ewma values for the bcb_fq count, thus removing the burden of this calculation from the picocode. in general, ewma for a counter x is calculated as follows:

ewma_x = (1 - k) * ewma_x + k * x

this calculation occurs for a configured sample period and k ∈ {1, 1/2, 1/4, 1/8}.

4.3.2.2 flow control hardware actions

when the picocode enqueues a packet to be transmitted to a target blade, the flow control hardware examines the state of the fq_p0_th and fq_p1_th threshold statuses and the priority of the enqueued packet to determine what flow control action is required.

- if the fcinfo field of the fcbpage of the enqueued packet is set to x'f', flow control is disabled and the packet is forwarded.
- for priority 0 packets, if fq_p0_th or localtb_mc_status_0 or remote tb status 0 is exceeded, the packet is discarded (tail drop discard). the picocode must set up a counter block to count these discards.
- for priority 1 packets, if fq_p1_th is exceeded, the packet is discarded (tail drop discard). otherwise, the transmit probability table is accessed and the value obtained is compared against a random number (∈ {0 ... 1}) generated by the hardware. when the transmit probability is less than the random number, the packet is discarded.

the index into the transmit probability table is qqqtcc, where:

qqq  quality of service (qos) class taken from the dscp (ingress fcbpage tos field bits 7:5)
t    remote target blade (tb) status; one bit corresponding to the target blade (zero when the frame is multicast)
cc   diffserv code point (dscp) assigned color (ingress fcbpage fcinfo field bits 1:0)
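a minimal c sketch of the ewma update and of the enqueue-time decision follows. the table layout, the probability scale, the random-number source, and the field extraction are assumptions; only the ewma formula and the 6-bit index composition (qqq | t | cc) follow the text above.

/* EWMA_X = (1 - K) * EWMA_X + K * X, with K in {1, 1/2, 1/4, 1/8}.
 * k_shift encodes K = 1 / (1 << k_shift), i.e. k_shift in {0,1,2,3}. */
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

static uint32_t ewma_update(uint32_t ewma, uint32_t x, unsigned k_shift)
{
    return ewma - (ewma >> k_shift) + (x >> k_shift);
}

/* Build the 6-bit transmit probability index QQQTCC. */
static unsigned txprob_index(unsigned qqq, unsigned t, unsigned cc)
{
    return ((qqq & 0x7u) << 3) | ((t & 0x1u) << 2) | (cc & 0x3u);
}

/* Decide transmit (true) or discard (false) for a priority 1 packet that has
 * already passed the tail-drop checks. prob[] holds values scaled to 0..65535. */
static bool p1_transmit(const uint16_t prob[64], unsigned qqq, unsigned t, unsigned cc)
{
    uint16_t p   = prob[txprob_index(qqq, t, cc)];
    uint16_t rnd = (uint16_t)(rand() & 0xFFFF);   /* stand-in for the hardware RNG */
    return p >= rnd;                              /* p < rnd means discard         */
}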
5. switch interface

the switch interface (swi) supports two high-speed data-aligned synchronous link (dasl) interfaces to attach the np4gs3 to itself, to other network processors, or to an external switch fabric. (for purposes of discussion, this text assumes that the connection is to a switch fabric.) each dasl link supports 3.25 to 4 gbps of throughput. the dasl links can be used in parallel (for example, to interconnect two network processors as dual network processors) or with one as the primary switch interface and the other as an alternate switch interface (for increased system availability). the dasl interfaces enable up to 64 network processors to be interconnected using an external switch fabric.

the swi supports the following:
- two dasl switch interfaces
- simultaneous transfer of cells on the swi in both ingress and egress directions
- building a cell header and frame header for each frame
- segmenting a frame into 64-byte switch cells
- cell packing

the swi's ingress side sends data to the switch fabric and its egress side receives data from the switch fabric. the dasl bus consists of four paralleled links: one master and three slaves. the np4gs3 supports two dasl channels (a and b). an arbiter sits between the ingress switch data mover (i-sdm) and the ingress switch cell interfaces sci_a-1 and sci_b-1. the arbiter directs the data traffic from the i-sdm and probe to the appropriate i-sci, based on the switch_bna device input and the settings in the dasl configuration register (see 13.29.1 dasl configuration register (dasl_config) on page 502).

figure 5-1 on page 142 shows the main functional blocks of the swi.
ibm powernp np4gs3 network processor preliminary switch interface page 142 of 554 np3_dl_sec05_swi.fm.08 may 18, 2001 the main units, described in following sections, are:  i-sdm: ingress switch data mover  i-sci: ingress switch cell interface  dasl: data-aligned synchronous links  e-sci: egress switch cell interface  e-sdm: egress switch data mover figure 5-1. switch interface functional blocks switch fabric a i-sci a arb dasl tx dasl rx e-sci a e-sdm a switch interface - egress from ingress eds to egress eds (e-swi) switch interface - ingress (i-swi) dasl component a switch fabric b b b b i-sdm probe b sdc
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 143 of 554 5.1 ingress switch data mover the ingress switch data mover (i-sdm) is the logical interface between the ingress enqueuer / dequeuer / scheduler ? s (eds) frame data flow and the switch fabric ? s cell data flow. the i-sdm segments the frames into cells and passes the cells to the i-sci. the i-sdm also provides the following frame alteration functions:  vlan insert or overlay. the four bytes comprising the vlan type field (x ? 8100 ? ) and the tag control field are placed after the 12th byte of the frame.  n-byte delete. bytes may be deleted from the incoming frame prior to being sent to the swi. the ingress eds controls the i-sdm data input by requesting the i-sdm to transmit data to the switch. to do this, the ingress eds gives the i-sdm the control information necessary to transmit one segment (either 48 or 58 bytes) of data from the selected frame. the control information consists of a buffer control block (bcb) address, frame control block (fcb) record contents, and the bcb address of the next frame going to the same switch destination (used for packing). with this information, the i-sdm builds a switch cell under a quadword-aligned format compatible with the data structure in the egress data store. the logical switch cell structure contains three fields: cell header, frame header, and data. the first cell of a frame contains a cell header, a frame header, and 48 bytes of frame data. following cells of a frame contain a cell header followed by 58 bytes of frame data. the final cell of a frame contains the cell header, remaining bytes of frame data, and any necessary bytes to pad out the remainder of the cell. to increase the effective bandwidth to the switch, the network processor implements frame packing where the remaining bytes of a final cell contain the frame header and beginning bytes of the next frame. packing is used with unicast frames when the target blade and priority of the next frame are the same as the target blade and priority of the preceding frame. in this case, the packed cell contains a 6-byte cell header, the remaining data bytes of the first frame, a 10-byte frame header of the second frame, and some data bytes of the second frame. packing of frames occurs on 16-byte boundaries within a cell. to complete the logical cell, the i-sdm assembles control information from the fcb that was sent from the ingress eds, and utilizes data bytes read from the ingress data store buffers. logical cells are passed to the i-sci using a grant handshake.
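as a worked example of the segmentation rule above (first cell: 6-byte cell header, 10-byte frame header, 48 data bytes; following cells: 6-byte cell header, 58 data bytes), the number of cells a frame occupies without packing can be computed as follows. the helper name is illustrative.

/* Cells needed for a frame of 'len' bytes, ignoring packing. */
static unsigned cells_for_frame(unsigned len)
{
    if (len <= 48)
        return 1;                         /* start-and-end cell              */
    return 1 + (len - 48 + 57) / 58;      /* first cell + ceil(rest / 58)    */
}

for instance, a 48-byte frame fits in one cell, a 106-byte frame in two, and a 107-byte frame in three; packing then reuses the padded tail of the final cell for the next frame to the same destination.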
ibm powernp np4gs3 network processor preliminary switch interface page 144 of 554 np3_dl_sec05_swi.fm.08 may 18, 2001 5.1.1 cell header a cell header is a 6-byte field holding control information for the cell. the first three bytes are used by the switch fabric for routing and flow control. the last three bytes are used by the target network processor. a cell header is sent with each cell across the attached switch fabric. figure 5-2 illustrates the format of a cell header; table 5-1 provides the field definitions. figure 5-2. cell header format msb byte 0 byte 1 byte 2 64-blade mode 16-blade mode byte 3 byte 4 byte 5 sb r r correlator qt endptr r r r qt st rlow (3:0) correlator endptr sb low (3:0) sb hi (5:4) st target blade ucnmc qualifier ucnmc msb
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 145 of 554 table 5-1. cell header fields (page1of2) field name definition notes qualifier this 8-bit field indicates the type of cell that is being transmitted. 7,2 cell format value. the type of cell format used for this cell. ? 10 ? frame format. cell data is from a single frame (i.e., not packed, see below). ? 11 ? packed frame format. the st field must be set to '01', indicating the end of the current frame. the endptr field indicates the location of the last byte of the cur- rent frame. the next frame starts on the next qw boundary. a frame may be packed only if it is unicast format, has the same priority and tb destination as the current frame, and is 49 bytes or larger. the correlator in the cell header cannot be the same value as the correlator in the frame header of the next frame found in the packed cell. all other values are reserved. 6 switch cell header parity. this even parity bit covers the qualifier and the tb fields. when a parity error is detected, the cell is discarded and the event is counted. even parity is achieved when the number of bits in the cell header that contain a ? 1 ? is even. 5:4 data cell indicator. ? 00 ? idle cell ? 11 ? data cell all other values are reserved. 3 reserved. set to ? 0 ? . 1:0 cell priority ? 00 ? highest priority - used only for port mirroring traffic. ? 01 ? high priority traffic ? 10 ? low priority traffic ? 11 ? reserved 1 (cell format value) 3 (cell priority) target blade target blade address (16-bit field). encoding of this field depends on the target blade mode. 16-blade the target blade field is a bit map of the destination target blades. target blade 0 corresponds to the msb of the target blade field. 64-blade valid unicast target blade field encodes are 0 through 63. multicast encodes are in the range of 512 through 65535. 2 ucnmc unicast - multicast indicator. this 1-bit field indicates the format of the frame carried in this cell. ? 0 ? multicast format. guided traffic must be sent in multicast format. ? 1 ? unicast format. 3 st(1:0) frame state indicator (2-bit field). provides information about the status of the frame currently being carried in the cell. 00 continuation of current frame (the cell contains a middle portion of a frame, but no start or end of a frame) 01 end of current frame. this code point must be used when the cell format value is '11' (packed frame format). (the cell contains the end of frame if "frame format" is indi- cated by the cell format value, or the end of one frame and the beginning of another frame if "packed frame format" is indicated.) 10 start of new frame (the cell contains the start of a frame, but no end of a frame) 11 start and end of new frame. used for frames 48 bytes or smaller. this code point is also used to indicate a reassembly abort command. abort is indicated when the end pointer field value is ? 0 ? . an aborted frame is discarded by the e-eds. at the point of reassembly, cells for a frame are expected to follow a start, continuation (may not be present for short frames) and end sequence. hardware in the e-eds detects when this sequence is not followed and reports these errors in the reassembly sequence error count register. frames with reassembly sequence errors are discarded. sb source blade address. the size of this field depends on the blade operational mode and indi- cates the value of the source blade. 4 1. 
qualifier bits 1 and 2 are used by the np4gs3's reassembly logic. 2. on egress, this field contains the output grant status information used by the ingress scheduler. 3. this field is used by the egress reassembly logic in selection of the reassembly control block. 4. on egress, this field is copied into the correlator field in the frame header when stored in the egress data store. it is also used by the egress reassembly logic in selection of the reassembly control block.
ibm powernp np4gs3 network processor preliminary switch interface page 146 of 554 np3_dl_sec05_swi.fm.08 may 18, 2001 correlator the size of this field depends on the blade operational mode. see hardware reference man- ual section 13.3 tb mode register (tb_mode). 16-blade correlator values 0 - 63 are used for unicast traffic. correlator values 0-31 are used for multicast traffic. 64-blade correlator values 0 - 15 are used for unicast traffic. correlator values 0-7 are used for multicast traffic. 3 qt(1:0) queue type (2-bit field). used to determine the required handling of the cell and the frame. bits 1:0 description 0x user traffic that is enqueued into a data queue (grx or gbx). 1x guided traffic that is enqueued into the guided frame queue. x0 cell may be dropped due to switch congestion. this may be used by the attached switch. np4gs3 hardware does not take any action based on this value. x1 cell may not be dropped due to switch congestion. this may be used by the attached switch. np4gs3 hardware does not take any action based on this value. qt(0) is set to 1 by the i-swi hardware when fc info field of the frame header is set to x ? f ? . endptr end pointer (6-bit field). the endptr field is a byte offset within the cell which indicates the loca- tion of the last data byte of the current frame in the cell when the st field indicates end of cur- rent frame (st = ? 01 ? or ? 11 ? ). valid values of the endptr field are in the range of 6 through 63. an endptr value of 0 is used only with an st value of '11' to indicate an abort command. values of 1 through 5 are reserved. in all other cases (st = ? 00 ? or ? 10 ? ), this field contains sequence checking information used by the frame reassembly logic in the network processor. the sequence checking information consists of a sequence number placed into this field. the first cell of a frame is assigned a sequence number of 0. in a packed cell, the sequence num- ber is not placed into the endptr field since the field must contain the end pointer for the pre- ceding frame; the next cell for this frame will contain a sequence number of 1. sequence numbers increment from 0 to 63 and will wrap if necessary. r reserved field, transmitted as ? 0 ? . should not be modified or checked by the switch fabric. table 5-1. cell header fields (page2of2) field name definition notes 1. qualifier bits 1 and 2 are used by the np4gs3's reassembly logic. 2. on egress, this field contains the output grant status information used by the ingress scheduler. 3. this field is used by the egress reassembly logic in selection of the reassembly control block. 4. on egress, this field is copied into the correlator field in the frame header when stored in the egress data store. it is also used by the egress reassembly logic in selection of the reassembly control block.
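the even-parity rule given in table 5-1 (the parity bit covers the qualifier and target blade fields, and even parity means the covered bits contain an even number of ones) could be checked as in the sketch below. how the covered bits are gathered is an interpretation made for this example, not a statement of the device's exact implementation.

/* Minimal even-parity check over the qualifier (which includes the parity
 * bit itself) and the 16-bit target blade field, per table 5-1. */
#include <stdint.h>
#include <stdbool.h>

static bool cell_header_parity_ok(uint8_t qualifier, uint16_t target_blade)
{
    uint32_t bits = ((uint32_t)qualifier << 16) | target_blade;
    unsigned ones = 0;
    for (; bits; bits &= bits - 1)
        ones++;                  /* count set bits in the covered fields */
    return (ones & 1u) == 0;     /* even parity: total number of 1s is even */
}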
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 147 of 554 5.1.2 frame header a frame header is a 10-byte field containing control information used by the target network processor and is sent once per frame. a frame header can immediately follow the cell header, or with packing enabled, it can also be placed at the start of the 2nd (d10), 3rd (d26), or 4th (d42) qw of the cell. figure 5-3 illustrates the format of a frame header; table 5-2 provides the field definitions. figure 5-3. frame header format 1. see description of the correlator field in frame header fields on page 148. (1:0) lid blade 64 r r r r r r blade 16 byte 0 byte 1 byte 2 fhe(31:8) byte 6 byte 7 byte 8 fhe(7:0) byte 9 stake (7:6) mc uc mc uc mc uc (1:0) lid stake(5:0) stake(5:0) mid lid(20:2) sp sp byte 4 byte 5 byte 3 correlator 1 with sb) (overwritten correlator 1 with sb) (overwritten dsu dsu ucnmc fhf fhf fc info
ibm powernp np4gs3 network processor preliminary switch interface page 148 of 554 np3_dl_sec05_swi.fm.08 may 18, 2001 table 5-2. frame header fields field name description notes ucnmc unicast - multicast indicator. this 1-bit field indicates the format of the frame header. ? 0 ? multicast format. ? 1 ? unicast format. 1 fc info flow control information (4-bit field). indicates the type of connection used for this frame. con- nection type encoding is used by the np4gs3 ? s flow control mechanisms. 2 lid lookup identifier. used by the egress processing to locate the necessary information to for- ward the frame to the appropriate target port with the correct qos. 3 mid multicast identifier. used by the egress processing to locate the multicast tree information which is used to forward the frame to the appropriate target ports with the correct qos. 3 stake available only for the multicast format of the frame header. this 8-bit field is used by egress processing to locate the start of the layer 3 header. 3 dsu dsu indicator. available only for the unicast format of the frame header. this 4-bit field indi- cates which egress data stores are used by this frame. defined as follows (where ? r ? indicates a reserved bit that is transmitted as ? 0 ? and is not modified or checked on receipt): 0rr0 data store 0 0rr1 data store 1 1rr0 data store 0 and data store 1 1rr1 data store 0 and data store 1 fhf frame header format. software controlled field. this field, with the addition of the uc field, is used by the hardware classifier in the epc to determine the code entry point for the frame. the uc field and the fhf form a 5-bit index into a configurable lookup table of code entry points used for egress processing. sp source port of the frame. fhe frame header extension (32-bit field). used by ingress processing to pass information to egress processing. the contents of this field depend on the fhf value used by egress pro- cessing to interpret the field. information passed reduces the amount of frame parsing required by egress processing when determining how to forward the frame. r reserved field, transmitted as ? 0 ? . should not be modified or checked by the switch fabric. correlator the size of this field depends on the blade operational mode. the sb information in the cell header is copied into the byte occupied by the correlator (overlaying the correlator and some reserved bits) when the frame header is written to the egress data store. this provides sb information for egress processing. 16-blade correlator values 0-63 are used for unicast traffic. correlator values 0-31 are used for multicast traffic. 64-blade correlator values 0-15 are used for unicast traffic. correlator values 0-7 are used for multicast traffic. 1. this field is used by egress reassembly logic in selection the reassembly control block. it is also used by the hardware classifier. 2. this field is passed through the ingress flow control before the frame is enqueued for transmission across the switch interface. it can be used by software to allow the hardware to make further flow control decisions on egress by passing connection information (type and dscp). 3. this field is passed by the hardware but is defined and used by the software.
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 149 of 554 5.2 ingress switch cell interface the ingress switch cell interface (i-sci) provides a continuous stream of transmit data to the dasl. after power on or reset of the network processor, the i-sci initialization process generates synchronization cells for the dasl. when the dasl is synchronized, the i-sci enters the operational state. when there are no data cells to send, the i-sci generates idle cells for transmission via the dasl. when there are data cells to send, the i-sci receives a logical cell from the i-sdm, translates the cell into the switch cell format, and transmits the cell via the dasl. the i-sci performs formatting on the cell headers to conform with the switch cell format. the i-sci unit relies on an internal fifo buffer (written by the i-sdm at the network processor core clock rate and read at the switch clock rate) performing the translation from logical cell format to switch cell format. 5.2.1 idle cell format when there is no data to send, idle cells are transmitted on the swi. all bytes in an idle cell have the value x ? cc ? , except for the first three bytes in the master stream. word 0, word 1, and word 2 of the master stream contain h0, h1, and h2 respectively. . table 5-3. idle cell format transmitted to the switch interface word # slave 1 (byte 0) master (byte 1) slave 3 (byte 2) slave 2 (byte 3) (bits 31:24) (bits 23:16) (bits 15:8) (bits 7:0) w0 x ? cc ? h0 x ? cc ? x ? cc ? w1 x ? cc ? h1 (x ? cc ? )x ? cc ? x ? cc ? w2 x ? cc ? h2 (x ? cc ? )x ? cc ? x ? cc ? w3 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w4 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w5 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w6 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w7 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w8 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w9 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w10 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w11 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w12 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w13 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w14 x ? cc ? x ? cc ? x ? cc ? x ? cc ? w15 crc crc crc crc
5.2.1.1 crc bytes: word 15

the cyclic redundancy check (crc) bytes sent on each byte stream contain an 8-bit crc checksum of the polynomial x^8 + x^4 + x^3 + x^2 + 1. each byte stream is independently calculated and the initial value loaded into the crc generator at the end of each idle cell (after word 15) is x'd0'. the final crc value is calculated starting with the first byte following an idle cell up to (but not including) the byte sent as part of word 15 in the next idle cell. figure 5-4 illustrates an example of the crc calculation.

5.2.1.2 i-sci transmit header for the idle cell

the idle cell header consists of three bytes, h0, h1, and h2, which are defined as follows:
- h0. also referred to as the qualifier byte, this 8-bit field indicates the type of cell being transmitted.
- h1 and h2. set to x'cc'.

the following table provides the field definitions for h0:

bit(s)  definition
7       reserved. transmitted as '0'.
6       switch cell header parity. this even parity bit covers the h0-h2 fields.
5:4     data cell indicator. '00' idle cell; '11' data cell. all other values are reserved.
3:2     reserved. transmitted as '00'.
1:0     reserved. this field should not be examined by the switch.

figure 5-4. crc calculation example (diagram: the crc is calculated over all stream bytes from the first byte after an idle cell through word 14 of the next idle cell; x'd0' is loaded as the initial value and the result is stored in word 15.)
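a generic bit-serial sketch of this crc-8 is shown below. the polynomial (x^8 + x^4 + x^3 + x^2 + 1, low byte 0x1d) and the x'd0' seed come from the text above; the msb-first shift direction is an assumption, and the routine has not been validated against the device.

/* Bit-serial CRC-8 over the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x1D),
 * seeded with the value loaded at the end of each idle cell (0xD0). */
#include <stdint.h>
#include <stddef.h>

static uint8_t dasl_crc8(const uint8_t *bytes, size_t n, uint8_t crc /* start at 0xD0 */)
{
    for (size_t i = 0; i < n; i++) {
        crc ^= bytes[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (uint8_t)((crc & 0x80) ? (uint8_t)(crc << 1) ^ 0x1D
                                         : (uint8_t)(crc << 1));
    }
    return crc;
}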
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 151 of 554 5.2.2 switch data cell format - ingress and egress table 5-4 shows the format of data cells sent to and from the swi via the dasl bus. notice that the packet routing switch cell header bytes (h0, h1, and h2) are all contained in the master byte stream. bytes desig- nated ch0 through ch5 are the cell header bytes described in figure 5-2. cell header format on page 144. table5-4.switchdatacellformat word # slave 1 dasl_out_0 dasl_out_1 master dasl_out_2 dasl_out_3 slave 3 dasl_out_4 dasl_out_5 slave 2 dasl_out_6 dasl_out_7 (bits 31:24) (bits 23:16) (bits 15:8) (bits 7:0) w0 ch5 ch0 (h0) d9 d7 w1 d0 ch1 (h1) d4 d2 w2 d1 ch2 (h2) d5 d3 w3 ch4 ch3 d8 d6 w4 d25 d23 d21 d19 w5 d12 d10 d16 d14 w6 d13 d11 d17 d15 w7 d24 d22 d20 d18 w8 d41 d39 d37 d35 w9 d28 d26 d32 d30 w10 d29 d27 d33 d31 w11 d40 d38 d36 d34 w12 d57 d55 d53 d51 w13 d44 d42 d48 d46 w14 d45 d43 d49 d47 w15 d56 d54 d52 d50 1. dasl_out_x (where x  {1,3,5,7})carrytheevenbitsoftheindicatedbytes. 2. dasl_out_x (where x  {0, 2, 4, 6}) carry the odd bits of the indicated bytes. 3. dxx indicates the data byte in the cell from the sdm interface. 4. chx indicates the cell header.
5.2.2 switch data cell format - ingress and egress

table 5-4 shows the format of data cells sent to and from the swi via the dasl bus. notice that the packet routing switch cell header bytes (h0, h1, and h2) are all contained in the master byte stream. bytes designated ch0 through ch5 are the cell header bytes described in figure 5-2. cell header format on page 144.

table 5-4. switch data cell format

word #  slave 1 (dasl_out_0/1)  master (dasl_out_2/3)  slave 3 (dasl_out_4/5)  slave 2 (dasl_out_6/7)
        (bits 31:24)            (bits 23:16)           (bits 15:8)             (bits 7:0)
w0      ch5                     ch0 (h0)               d9                      d7
w1      d0                      ch1 (h1)               d4                      d2
w2      d1                      ch2 (h2)               d5                      d3
w3      ch4                     ch3                    d8                      d6
w4      d25                     d23                    d21                     d19
w5      d12                     d10                    d16                     d14
w6      d13                     d11                    d17                     d15
w7      d24                     d22                    d20                     d18
w8      d41                     d39                    d37                     d35
w9      d28                     d26                    d32                     d30
w10     d29                     d27                    d33                     d31
w11     d40                     d38                    d36                     d34
w12     d57                     d55                    d53                     d51
w13     d44                     d42                    d48                     d46
w14     d45                     d43                    d49                     d47
w15     d56                     d54                    d52                     d50

1. dasl_out_x (where x ∈ {1, 3, 5, 7}) carry the even bits of the indicated bytes.
2. dasl_out_x (where x ∈ {0, 2, 4, 6}) carry the odd bits of the indicated bytes.
3. dxx indicates the data byte in the cell from the sdm interface.
4. chx indicates the cell header.
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 153 of 554 5.4 egress switch cell interface the egress switch cell interface (e-sci) receives cells from the switch fabric and passes data cells to the egress switch data mover (e-sdm). the e-sci discards idle cells received from the switch fabric, checks the parity of the received cells, and discards cells with bad parity. the e-sci strips out the target blade per-port grant information from the switch cells and sends this grant information to the network processor ingress units for use in ingress data flow control. the e-sci assists in egress data flow control by throttling the switch fabric to prevent data overruns. 5.4.1 master and multicast grant reporting the master and multicast grant (master_grant and mc_grant) i/o signals report congestion in the attached switch to the np4gs3. when master_grant (see section table 2-1. signal pin functions on page 38) indi- cates congestion, the ingress scheduler is disabled from sending any traffic to the switch interface. when multicast_grant (see section table 2-1. signal pin functions on page 38) indicates congestion, the ingress scheduler is disabled from sending any multicast traffic to the switch interface. 5.4.2 output queue grant reporting most fields in the cell header must pass through the switch fabric unchanged (idle cells are not transmitted through a switch fabric; they originate at the output port of the switch). the exception to this is the target blade field (see figure 5-2: cell header format on page 144) which contains output queue grant (oqg) information used by the network processor's ingress eds when selecting data to be sent to the switch. a target port is not selected if its corresponding output queue grant information is set to ? 0 ? . further, output queue grant information is sent for each of the three priorities supported by the network processor. this is done using multiple cell transfers; starting with priority 0, reporting output queue grant for all supported target blades, then repeating this process for each priority level until a value of 2 is reached, and starting over at priority 0 again. for 16-blade operational mode, the sequence is: 1. priority 0; oqg for blades 0:15 (h1 contains 0:7, h2 contains 8:15) 2. priority 1; oqg for blades 0:15 (h1 contains 0:7, h2 contains 8:15) 3. priority 2; oqg for blades 0:15 (h1 contains 0:7, h2 contains 8:15) 4. start over at step 1. the idle cell indicates the priority of the oqg it carries. this allows the network processor to synchronize with the external switch. for 64-blade operational mode, the sequence is: 1. priority 0; oqg for blades 0:15 (h1 contains 0:7, h2 contains 8:15) 2. priority 0; oqg for blades 16:31 (h1 contains 16:23, h2 contains 24:31) 3. priority 0; oqg for blades 32:47 (h1 contains 32:39, h2 contains 40:47) 4. priority 0; oqg for blades 48:63 (h1 contains 48:55, h2 contains 56:63) 5. priority 1; oqg for blades 0:15 (h1 contains 0:7, h2 contains 8:15) 6. priority 1; oqg for blades 16:31 (h1 contains 16:23, h2 contains 24:31) 7. priority 1; oqg for blades 32:47 (h1 contains 32:39, h2 contains 40:47)
ibm powernp np4gs3 network processor preliminary switch interface page 154 of 554 np3_dl_sec05_swi.fm.08 may 18, 2001 8. priority 1; oqg for blades 48:63 (h1 contains 48:55, h2 contains 56:63) 9. priority 2; oqg for blades 0:15 (h1 contains 0:7, h2 contains 8:15) 10. priority 2; oqg for blades 16:31 (h1 contains 16:23, h2 contains 24:31) 11. priority 2; oqg for blades 32:47 (h1 contains 32:39, h2 contains 40:47) 12. priority 2; oqg for blades 48:63 (h1 contains 48:55, h2 contains 56:63) 13. start over at step 1. for 64-blade mode, this sequence is interrupted when an idle cell is sent. table 5-7 shows an idle cell that carries the oqg for all blades. status for each priority is given in increasing order; (p0 followed by p1 followed by p2). the network processor restarts the sequence at step one with the arrival of the next data cell. 5.4.2.1 oqg reporting in external wrap mode when the np4gs3 is configured for external wrap mode (see bit 31 in section 13.29.1 dasl configuration register (dasl_config) on page 502), oqg is not collected as described in section 5.4.2 . instead, the master grant i/o signals are used to collect this information. in this mode of operation, only target blades 0 and 1 are valid addresses. oqg status for all other blade addresses is always reported as zero to the ingress scheduler. master_grant_a is connected to send_grant_a of the same np4gs3, while master_grant _b of one np4gs3 is connected to send_grant_b of the other. thus, the oqg status for tb0 is provided by master_grant_a on tb0 and from master_grant_b on tb1. the oqg status for tb1 is provided by master_grant_b on tb0 and from master_grant_a on tb1. internally to both np4gs3s, master grant is derived from the logical and of both master grant i/o signals (per priority).
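the 64-blade oqg reporting cycle described in section 5.4.2 above (three priorities, each reported for four groups of 16 blades, then repeating) can be generated with the small sketch below. the helper name and the step encoding are illustrative only.

/* 64-blade OQG reporting cycle: 3 priorities x 4 blade groups of 16,
 * repeating every 12 steps, per the sequence in section 5.4.2. */
static void oqg_slot(unsigned step, unsigned *priority, unsigned *first_blade)
{
    step %= 12;
    *priority    = step / 4;         /* 0, 1, 2                    */
    *first_blade = (step % 4) * 16;  /* blades 0, 16, 32, or 48    */
}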
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 155 of 554 figure 5-6 on page 156 illustrates a single np configuration. note that in this configuration, the dasl a inter- face is used. with switch_bna tied to 1, this makes this interface the "alternate" (see 13.29.1.1 dynamic switch interface selection on page 504). since the np must have the "primary" active, even though it is not used in this configuration, dasl_bypass_wrap must be enabled for the dasl b interface ( 13.29.2 dasl bypass and wrap register (dasl_bypass_wrap) on page 506) for proper operation. my_tb is set to 0 for this blade, and both oqg status and internal master grant is provided by master_grant_a. figure 5-5. external wrap mode (two np4gs3 interconnected) tb1 tb0 dasl_in_b dasl_out_b master_grant_a send_grant_b dasl_in_a send_grant_a dasl_out_a mc_grant_a dasl a master_grant_a dasl_in_a send_grant_a dasl_out_a mc_grant_a dasl a master_grant_b dasl_in_b dasl_out_b send_grant_b master_grant_b dasl b dasl b mc_grant_b mc_grant_b switch_bna switch_bna tie 1 tie 1 tie 1 tie 1 note: in this configuration oqg status and master grant are not differentiated by priority.
5.4.3 switch fabric to network processor egress idle cell

the egress swi interface requires idle cells when there is no data. an idle cell requires:
- the last bytes on each stream contain a trailer cyclic redundancy check (crc).
- for the master stream, the header h0 as described in table 5-5 on page 157.
- for the master stream, h1 - h2 contain the target blade grant priority information when in 16-blade mode.
- for the master stream, h1 - h2 contain x'cc' when in 64-blade mode.
- all other bytes contain x'cc' when in 16-blade mode, or oqg (see table 5-6 on page 157) when in 64-blade mode.

figure 5-6. external wrap mode (single np4gs3 configuration). note: in this configuration oqg status and master grant are not differentiated by priority. dasl b must be set up for internal wrap and the alternate port must be enabled: dasl_config(31) = 1 (external_wrap_mode), dasl_config(30:29) = '01' (gs_mode), my_tb = 0, [dasl_bypass_wrap_en(31:30) = '10'].
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 157 of 554 . table 5-5. receive cell header byte h0 for an idle cell bit(s) definition 7 reserved. transmitted as ? 0 ? . 6 switch cell header parity. this even parity bit covers the h0-h2 fields. 5:4 data cell indicator. ? 00 ? idle cell ? 11 ? data cell all other values are reserved. 3:2 reserved. transmitted as ? 00 ? . 1:0 output queue grant priority. when in 16-blade mode, indicates the priority level of the output queue grant information carried in h1-h2. otherwise the network processor ignores this field. table 5-6. idle cell format received from the switch interface - 16-blade mode slave 1 master 1 slave 3 slave 2 byte 0 byte 1 byte 2 byte 3 x ? cc ? h0 x ? cc ? x ? cc ? x ? cc ? h1 (oqg (0:7)) x ? cc ? x ? cc ? x ? cc ? h2 (oqg (8:15)) x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? x ? cc ? crc crc crc crc
5.4.4 receive header formats for sync cells

sync cells are a special type of idle cell. they are similar to the received idle cell for 16-blade mode (see table 5-6 on page 157) but h0, h1, and h2 are set to a value of x'cc'. all sync cells are discarded.

table 5-7. idle cell format received from the switch interface - 64-blade mode

        slave 1 (byte 0)  master (byte 1)  slave 3 (byte 2)  slave 2 (byte 3)
        x'cc'             h0               x'cc'             x'cc'
        x'cc'             h1 (x'cc')       x'cc'             x'cc'
        x'cc'             h2 (x'cc')       x'cc'             x'cc'
        (remaining words: oqg status for blades 0 through 63, given for each priority in increasing order - p0, then p1, then p2)
        crc               crc              crc               crc
ibm powernp np4gs3 preliminary network processor np3_dl_sec05_swi.fm.08 may 18, 2001 switch interface page 159 of 554 5.5 egress switch data mover the egress switch data mover (e-sdm) is the logical interface between the switch cell data flow of the e-sci and the frame data flow of the egress eds. the e-sdm serves as a buffer for cells flowing from the e-sci to the egress eds. the e-sdm extracts control information, such as the frame correlator, which is passed to the egress eds. the egress eds uses this control information, along with data from the e-sdm, to re- assemble the cells into frames.
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 161 of 554 6. egress enqueuer / dequeuer / scheduler the egress enqueuer / dequeuer / scheduler (egress eds) interfaces with the physical mac multiplexer (pmm), the embedded processor complex (epc), and the egress switch interface (egress swi). it is responsible for handling all the buffer, frame, and queue management for reassembly and transmission on the egress side of the network processor. the egress eds reassembles data that have been transported from a switch fabric through the data-aligned synchronous link (dasl) interface to the egress pmm. it reassembles frames sent from up to 64 network pro- cessors (up to 3072 simultaneous reassemblies). each switch cell received from the egress swi is examined and stored in the appropriate egress data store (egress ds) for reassembly into its original frame. when the frame is completely received from the swi, the egress eds enqueues it to the epc for processing. once the epc processes the frame, it provides forwarding and quality of service (qos) information to the egress eds. the egress eds then invokes hardware configured flow control mechanisms and then enqueues it to either the scheduler, when enabled, or to a target port (tp) queue for transmission to the egress pmm (which, in turn, sends the data to physical layer devices). the egress eds supports the following functions, as illustrated in figure 6-1 on page 162:  external egress data store  automatic allocation of buffer and frame control blocks for each frame  epc queues (gfq, gtq, gr0, gr1, gb0, gb1, and gpq)  target port queues with two priorities  buffer thresholds and flow control actions  bandwidth and best effort scheduling  reading and writing of frame data stored in egress data store  up to 512 k buffer twins depending on memory configuration  up to 512 k of frame control blocks (fcbs) depending on memory configuration  unicast and multicast frames  cell packing  40 external ports plus wrap port interface to the pmm  discard function  hardware initialization of internal and external data structures
ibm powernp np4gs3 network processor preliminary egress enqueuer / dequeuer / scheduler page 162 of 554 np3_dl_sec06_eeds.fm.08 may 18, 2001 6.1 functional blocks figure 6-1 illustrates the functional blocks of the egress eds. the list following the figure explains each func- tional block. figure 6-1. egress eds functional blocks e data store interface writes the external egress ds during frame reassembly and reads them during frame transmission. also gives the epc access to the egress ds. the data store interface supports two external data stores: ds0 and ds1. dpq discard port queue. releases twin buffers back to the free queue stack. the pico- code uses this queue to discard frames where header twins have been allocated. e-gdq discard queue stack. holds frames that need to be discarded. the hardware uses this queue to discard frames when the egress ds is congested or to re-walk a frame marked for discard for a half-duplex port. the picocode uses this queue to discard frames that do not have header twins allocated. egress pcb egress port control block. contains the necessary information to send a frame to the egress pmm for transmission. the egress eds uses this information to walk the twin buffer chain and send the data to the egress pmm ? s output port. there is a pcb entry for each target port, plus one for discard and one for wrap. each entry holds information for two frames: the current frame being sent to the pmm port and the next frame to be sent. data store interface egress pcb rcb release logic mcc tables flow control wrapq dpq tp39q_p1 tp0q_po gb1 gb0 gr1 gr0 gtq gfq fqs egress pmm egress epc e-gdq swi ds gpq fcbfq scheduler
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 163 of 554 fcbfq frame control block free queue. lists free egress fcbs. fcbs store all the infor- mation needed to describe the frame on the egress side, such as starting buffer, length, mcc address, frame alteration, and so on. the egress eds obtains an fcb from the free queue when the epc enqueues the frame to either a flow qcb (see flow queues on page 176) or a tp after epc processing and flow control actions are complete. fqs free queue stack. holds a list of free egress twin buffers which are used by the egress eds during frame reassembly and by the epc during frame alteration. the twin buffers store the frame data or frame alteration header twins and also contain the link pointer to the next buffer in the chain. the fqs obtains a new free twin buffer any time the egress eds needs one and returns free twin buffers after frames are discarded or transmitted. gb0, gb1 low priority data queues. gb0 is for frames stored in egress ds0. gb1 is for frames stored in egress ds1. gfq guided frame queue. queue that contains guided frames for delivery for the egress side of the network processor to the guided frame handler. gpq powerpc queue. queue that contains frames re-enqueued for delivery to the general powerpc handler (gph) thread for processing. gr0, gr1 high priority data queues. gr0 is for frames stored in egress ds0. gr1 is for frames stored in egress ds1. gtq general table queue. queue that contains guided frames re-enqueued by pico- code for delivery to the general table handler (gth) thread. mcc table multicast count table. each entry stores the number of multicast frames associ- ated with a particular set of twin buffers. if a frame is to be multicast to more than one target port, the epc enqueues the frame to each target port, causing an entry in the mcc table to be incremented. as each target port finishes its copy of the frame, the mcc table entry is decremented. when all ports have sent their copies of the frame, the associated twin buffers are released. rcb reassembly control block. used by the egress eds to reassemble the cells received from the switch fabric into their original frames. contains pointers to the egress ds to specify where the contents of the current cell should be stored. helps the egress eds keep track of the frame length and which epc queue to use. release logic releases twin buffers after the pmm has finished with the contents of the buffer. the release logic checks the mcc table to determine if the buffer can be released or is still needed for some other copy of a multicast frame. tp0q - tp39q target port queues. hold linked lists of frames destined for a tp. two queues are associated with each of the 40 possible tps. these queues are prioritized from high (p0) to low (p1) using a strict priority service scheme (all higher priority queues within a target port set must be empty before starting a lower priority queue).
ibm powernp np4gs3 network processor preliminary egress enqueuer / dequeuer / scheduler page 164 of 554 np3_dl_sec06_eeds.fm.08 may 18, 2001 6.2 operation the egress eds receives switch cells from the egress switch interface (swi) along with information that has been preprocessed by the egress switch data mover (sdm) such as the reassembly correlator, source blade, multicast indication, priority, and target data store (ds). in order to optimize the data transfer to an egress ds, the egress eds uses the target ds ? s information to determine which egress sdm to service. the two egress dss, ds0 and ds1, use alternating write windows, meaning the egress eds can write one cell to one data store each ? cell window time ? (the time needed to store an entire switch cell (64 bytes) in external dram). cell window time is configured using the dram parameter register ? s 11/10 field. after the appropriate egress sdm has been selected, the egress eds reads the cell data out of the sdm and uses the reassembly correlator, source blade, multicast indication, and priority information to index into the reassembly control block (rcb). the rcb contains all the information needed to reassemble the frame, including the buffer address to use in the egress ds. the cell data is stored in the buffer and the rcb is updated to prepare for the next cell associated with the same frame. the egress eds manages 3072 rcb entries, and each entry contains information such as start-of-frame indicator, data store buffer address, current reassembled frame length, and queue type. cells from several source blades are interleaved coming from the swi, and the egress eds uses the rcb information to rebuild each frame. the egress eds uses a free buffer from the head of the free queue stack (fqs) as needed to store frame data in the egress ds. the egress eds stores the cell data in the appropriate buffer and also stores the buffer chaining information over the cell header data. when a packed cell arrives, the egress eds stores each frame ? s information in two separate twin buffers. the first portion of the packed cell contains the end of a frame. this data is stored in the appropriate twin buffer as pointed to by the rcb. the remaining portion of the packed cell is the beginning of another frame and this data is stored in a second twin buffer as indicated by the second frame ? s rcb entry. figure 6-2 illustrates the cell buffers and storage in an egress ds. wrapq wrap queue. two wrap queues, one for guided frames and one for data frames, send frames from the egress side to the ingress side of the network processor. these queues allow the network processor to respond to a guided frame sent from a remote control point function (cpf). the guided frame is received from the switch fabric on the egress side and is wrapped to the ingress side to allow a response to be sent to the cpf across the switch fabric. a data frame may be wrapped when processing on a remote cpf is required.
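returning to the reassembly flow described at the start of section 6.2, the per-frame rcb state (start-of-frame indicator, data store buffer address, current reassembled frame length, and queue type) can be modeled as in the sketch below. the field names and the helper are illustrative assumptions, not the device's actual layout.

/* Per-frame reassembly state, modeled on the RCB fields listed in section 6.2. */
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool     active;       /* a frame is currently being reassembled        */
    uint32_t buf_addr;     /* egress DS twin buffer receiving the next data */
    uint32_t frame_len;    /* bytes reassembled so far                      */
    uint8_t  epc_queue;    /* EPC queue to use when the frame completes     */
} rcb_t;

/* Account for one received cell: 'payload_len' bytes were written into the
 * egress data store at rcb->buf_addr, and 'next_buf' is the twin buffer to
 * use for the following cell of the same frame. Returns true at end of frame. */
static bool rcb_absorb_cell(rcb_t *rcb, uint32_t payload_len,
                            uint32_t next_buf, bool end_of_frame)
{
    rcb->frame_len += payload_len;
    rcb->buf_addr   = next_buf;
    rcb->active     = !end_of_frame;
    return end_of_frame;   /* caller then enqueues to GFQ, GR0/GR1, or GB0/GB1 */
}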
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 165 of 554 when the entire frame is reassembled, the egress eds enqueues it to one of several epc queues. if the reassembled frame is a guided frame, the egress eds uses the gfq. high priority frames are placed in either the gr0 or the gr1. low priority frames are placed in either the gb0 or the gb1. the epc services these queues and requests a programmable amount of read data from the various frames in order to process them (see table 7-78: port configuration memory content on page 274). the egress eds reads the data from the egress data store and passes it back to the epc. additional reads or writes can occur while the epc is processing the frame. the egress eds performs all necessary reads or writes to the egress data store as requested by the epc. when the frame has been processed, the epc enqueues the frame to the egress eds. if the frame ? s destina- tion is the general table handler (gth), the egress eds enqueues the frame to the gtq. if the frame is to figure 6-2. cell formats and storage in the egress ds ch 1 (0:5) fh 1 (0:9) fd 1 (00:15) fd 1 (16:31) fd 1 (32:47) ch 1 (0:5) fh 1 (0:9) fd 1 (00:15) fd 1 (16:31) fd 1 (32:47) ch 1 (0:5) fd 1 (48:57) fd 1 (58:73) fd 1 (74:89) fd 1 (90:105) lp 1 (0:5) fd 1 (48:57) fd 1 (58:73) fd 1 (74:89) fd 1 (90:105) fqs fqs ch 2 (0:5) fd 2 (22:31) fd 2 (32:47) fd 2 (48:63) fd 2 (64:79) lp 2 (0:5) fd 2 (22:31) fd 2 (32:47) fd 2 (48:63) fd 2 (64:79) ch 1 (0:5) fd 1 (106:115) fd 1 (116:120) fh 2 (0:9) fd 2 (00:05) fd 2 (06:21) ch 1 (0:5) fd 1 (106:115) fd 1 (116:120) na (0:10) fh 2 (0:9) fd 2 (00:05) fd 2 (06:21) 128-byte linked twin buffers for frame 1 128-byte linked twin buffers for frame 2 non-packed start of frame cell non-packed continuation cell packed end of frame cell non-packed continuation cell ch = cell header fh = frame header fd = frame data lp = link pointer na = unused bytes data buffer data buffer twin buffer
ibm powernp np4gs3 network processor preliminary egress enqueuer / dequeuer / scheduler page 166 of 554 np3_dl_sec06_eeds.fm.08 may 18, 2001 be discarded, it is placed in the dpq. if the frame needs to be wrapped to the ingress side, it is placed in the wrap queue. all other frames are subject to flow control actions. if flow control does not discard the frame, the frame is placed into the scheduler, if enabled, or placed directly into a target port queue. each target port supports two priorities and therefore has two queues. the epc indicates which target port and which priority should be used for each frame enqueued. the two queues per port use a strict priority scheme, which means that all high priority traffic must be transmitted before any lower priority traffic will be sent. frames destined for the scheduler, target ports, or wrap ports have a frame control block (fcb) assigned by the egress eds. the fcb holds all the information needed to transmit the frame including starting buffer address, frame length, and frame alteration information, as illustrated in figure 6-3 . if the epc needs to forward more than one copy of the frame to different target ports, an entry in the mcc table is used to indi- cate the total number of copies to send. the epc enqueues the frame to different target ports or different flow qcbs and each enqueue creates a new fcb with (possibly) its own unique frame alteration. figure 6-3. tpq, fcb, and egress frame example target port queues frame control blocks target port 5 target port 9 multicast frame last frame in target port queue 5 first frame in target port queue 9 unicast frame last frame in target port queue 9 unicast frame first frame in target port queue 5 hc=2t hc=2t fba fl = y fba fl = y fba fl = x fba fl = z h = head t=tail c = queue count fba = first buffer address fl = frame length buf = buffer lp = link pointer frame length =xbytes frame length =ybytes frame length =zbytes buf 32 buf 12 buf 3 buf 9 buf 7 buf 23 buf 15 buf 39 buf 99 lp = 9 lp = 15 lp = 23 lp = 39 lp = 99 lp = 7
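the mcc table bookkeeping described above (one copy count per multicast frame, decremented as each target port finishes, with the twin buffers returned to the free queue stack only when the last copy is done) can be sketched as follows. the names are assumptions made for the example.

/* Illustrative release decision for multicast twin buffers. Returns true if
 * the caller should push the frame's twin buffers back onto the FQS; false
 * if other copies of the multicast frame are still in use. */
#include <stdint.h>
#include <stdbool.h>

static bool mcc_release(uint16_t mcc_table[], unsigned mcc_index)
{
    if (mcc_table[mcc_index] > 1) {
        mcc_table[mcc_index]--;     /* another port still owns a copy */
        return false;
    }
    mcc_table[mcc_index] = 0;       /* last copy has been transmitted */
    return true;
}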
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 167 of 554 when a frame reaches the head of a target port queue, it is placed in the egress port control block (pcb) entry for that port and the fcb for that frame is placed on the fcb free queue. the egress eds uses the pcb to manage frames being sent to the pmm for transmission. the pcb stores information needed to retrieve frame data, such as current buffer address and frame length, from the egress ds. the pcb allows up to 40 frames to be retrieved simultaneously from the egress ds. the pcb also supports the wrap port and the discard port queue. as data is retrieved from the egress ds and passed to the pmm, the pcb moni- tors transfers and stores buffer link pointers that enable the pcb to walk the buffer chain for the frame. when the entire buffer chain has been traversed, the pcb entry is updated with the next frame for that target port. as the pmm uses the data from each buffer, it passes the buffer pointer back to the release logic in the egress eds. the release logic examines the mcc table entry to determine if the buffer should be returned to the fqs. (half-duplex ports will not have their twin buffers released until the entire frame has been trans- mitted, in order to to support the recovery actions that are necessary when a collision occurs on the ethernet media.) if the mcc entry indicates that no other copies of this frame are needed, the buffer pointer is stored in the fqs. however, if the mcc entry indicates that other copies of this frame are still being used, the egress eds decrements the mcc entry, but does no further action with this buffer pointer. the dpq contains frames that have been enqueued for discard by the epc and by the release logic. the hardware uses the dpq to discard the last copy of frames transmitted on half duplex ports. the dpq is dequeued into the pcb ? s discard entry, where the frame data is read from the ds to obtain buffer chaining information necessary to locate all twin buffers of the frame and to release these twin buffers back to the free pool (fqs). 6.3 egress flow control flow control (whether to forward or discard frames) in the network processor is provided by hardware assist mechanisms and picocode that implements a selected flow control algorithm. in general, flow control algo- rithms require information about the congestion state of the data flow, including the rate at which packets arrive, the current status of the data store, the current status of target blades, and so on. a transmit proba- bility for various flows is an output of these algorithms. the network processor implements flow control in two ways:  flow control is invoked when the frame is enqueued to either a target port queue or a flow queue control block (qcb). the hardware assist mechanisms use the transmit probability along with tail drop conges- tion indicators to determine if a forwarding or discard action should be taken during frame enqueue oper- ation. the flow control hardware uses the picocode ? s entries in the egress transmit probability memory to determine what flow control actions are required.  flow control is invoked when frame data enters the network processor. when the egress ds is suffi- ciently congested, these flow control actions discard all frames. the threshold that controls the invocation of these actions is fq_es_threshold_0 (see table 6-1: flow control hardware facilities on page 168 for more information). 
6.3.1 flow control hardware facilities the hardware facilities listed in table 6-1 are provided for the picocode's use when implementing a flow control algorithm. the picocode uses the information from these facilities to create entries in the egress transmit probability memory. the flow control hardware uses these entries when determining what flow control actions are required.
ibm powernp np4gs3 network processor preliminary egress enqueuer / dequeuer / scheduler page 168 of 554 np3_dl_sec06_eeds.fm.08 may 18, 2001 table 6-1. flow control hardware facilities (page 1 of 2) name definition access free queue count instantaneous count of the number of free twins available in the egress data store. cab fq_es_threshold_0 threshold for free queue count. when the free queue count < fq_es_threshold_0, no further twins are allocated for incoming data. user packets that have started reassem- bly are discarded when they receive data when this threshold is violated. guided traffic is not discarded. the number of packets discarded is counted in the reassembly discard counter. when this threshold is violated, an interrupt (class 0, bit 4) is signaled. cab fq_es_threshold_1 threshold for free queue count. when the free queue count < fq_es_threshold_1, an interrupt (class 0, bit 5) is signaled. cab fq_es_threshold_2 threshold for free queue count. when the free queue count < fq_es_threshold_2, an interrupt (class 0, bit 6) is signaled, and if enabled by dmu configuration, the ethernet mac preamble is reduced to 6 bytes. cab arrival rate counter the arrival rate of data into the egress data store. this counter increments each time there is a dequeue from the twin free queue. when read by picocode via the cab, this counter is set to 0 (read with reset). cab fq count ewma the calculated ewma of the free queue count. cab p0 twin count the number of twins in priority 0 packets that have been enqueued to flow queues, but have not been dequeued from target port queues. cab p1 twin count the number of twins in priority 1 packets that have been enqueued to flow queues, but have not been dequeued from target port queues. cab p0 twin count threshold threshold for p0 twin count. it is used when determining flow control actions against pri- ority 0 traffic. when p0 twin count > p0 twin count threshold, the flow control hardware will discard the frame. cab p1 twin count threshold threshold for p1 twin count. it is used when determining flow control actions against pri- ority 1 traffic. when p1 twin count > p1 twin count threshold, the flow control hardware will discard the frame. cab egress p0 twin count ewma the calculated ewma based on the count of the number of twins allocated to p0 traffic during the sample period. hardware maintains a count of the number of twins allocated to p0 traffic at enqueue. cab egress p1twin count ewma the calculated ewma based on the count of the number of twins allocated to p1 traffic during the sample period. hardware maintains a count of the number of twins allocated to p1 traffic at enqueue. cab egress p0 twin count ewma threshold the congestion status of the egress data store in each target blade. it is the result of a comparison between this configured threshold and the ewma of the offered rate of prior- ity 0 traffic (p0 twin count ewma > p0 twin count ewma threshold). this information is transmitted to remote blades via the res_data i/o and is collected in the remote tb sta- tus 0 register in the ingress flow control hardware. cab
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 169 of 554 6.3.2 remote egress status bus the remote egress status (res) bus communicates the congestion state of the egress data stores of all nets in a system to the ingress flow control of every np4gs3 in that system. the ingress portion of each np4gs3 can then preemptively discard frames destined for a congested np4gs3 without consuming addi- tional bandwidth through the switch. 6.3.2.1 bus sequence and timing the res bus consists of two bidirectional signals: 1. res_data . this signal is time-division multiplexed between all the np4gs3s in a system. only one np4gs3 at a time drives this signal. each np4gs3 drives two priorities of congestion status sequentially. all of the np4gs3s in a system sample this data and store it for use by the ingress flow control to make discard decisions. 2. res_sync . this signal is received by all np4gs3s in a system. each np4gs3 samples the negative edge of this signal to derive its time slot to drive egress data store congestion information. this signal is periodic and repeats to allow the np4gs3s to resynchronize after 16 or 64 time slots have passed, depending on the value of the tb mode register. in a system, one np4gs3 can be configured to drive this signal to the other np4gs3s. alternatively, an external device can provide the stimulus for this signal. figure 6-4 shows a timing diagram of the operation of the res bus. egress p1 twin count ewma threshold the congestion status of the egress data store in each target blade. it is the result of a comparison between this configured threshold and the ewma of the offered rate of prior- ity 0 traffic. (p1 twin count ewma > p1 twin count ewma threshold). this information is transmitted to remote blades via the res_data i/o and is collected in the remote tb status 1 register in the ingress flow control hardware. cab target port pq+fq_th target port port queue plus egress scheduler (flow qcb) threshold. the target port queues maintain a count of the number of twins allocated to the target port. the count is incremented on enqueue (after flow control transmit action is taken) and decremented on dequeue from the target port. thresholds for each priority can be configured for for all ports (0:39). when the number of twins assigned to a target port queue exceeds the threshold, its threshold exceed status is set to 1. the status is used to index into the transmit probability memory for packets going to the target port. cab qcb threshold flow queue threshold. when the number of twins assigned to this flow queue exceeds this threshold, the threshold exceed status is set to 1. the status is used to index into the transmit probability memory. cab table 6-1. flow control hardware facilities (page 2 of 2) name definition access
the res bus operates on the same clock frequency as the internal dasl clock. this clock period can range from 8 ns to 10 ns. however, the res bus clock is not necessarily in phase with the dasl clock. in addition, each np4gs3 is not necessarily in phase with other np4gs3s in the same system. the res_sync signal is responsible for synchronizing all the np4gs3s' usage of the res data bus.

the res bus is time-division multiplexed between every np4gs3 in the system. each np4gs3 moves its congestion information onto the res bus in the order of its my_tb register (that is, the np4gs3 with a my_tb setting of 0 drives immediately after the fall of res_sync, followed by the np4gs3 with a my_tb setting of 1, and so on). within each np4gs3 time slot, the np4gs3 puts four values on the res_data signal. each value is held for eight dasl clock cycles. therefore, each np4gs3 time slot is 32 dasl clock cycles. the protocol for sending the congestion information is as follows:

1. high-z. the np4gs3 keeps its res_data line in high impedance for eight dasl clock cycles. this allows the bus to turn around from one np4gs3 to another.
2. p0. the np4gs3 drives its priority 0 (p0) egress data store congestion information for eight dasl clock cycles. this status is high when the egress p0 twin count ewma register value is greater than the egress p0 twin count ewma threshold register value. it is low otherwise.
3. p1. the np4gs3 drives its priority 1 (p1) egress data store congestion information for eight dasl clock cycles. this status is high when the egress p1 twin count ewma register value is greater than the egress p1 twin count ewma threshold register value. it is low otherwise.
4. reserved. the np4gs3 drives a low value for eight dasl clock cycles. this is reserved for future use.

the ingress flow control samples res_data during the midpoint of its eight dasl clock cycles. this provides a large enough sample window to allow for jitter and phase differences between the dasl clocks of two different np4gs3s. the res_sync signal is driven high during the reserved cycle of the last np4gs3's time slot. the res_sync signal therefore has a period equal to 32 dasl clock periods multiplied by the number of np4gs3s supported in this system. the number of np4gs3s can be either 16 or 64, depending on the value of the tb mode register.

figure 6-4. res bus timing (timing diagram: res_sync and res_data, showing the high-z, p0, p1, and reserved phases of eight dasl cycles each within the time slots of np 0 through np n, and the res_data sampling points)
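the time-slot arithmetic above is straightforward. the following sketch is illustrative only (it is not hardware or picocode, and the my_tb value is just an example); it computes the slot duration, the res_sync period, and where a given np4gs3's slot begins relative to the falling edge of res_sync:

#include <stdio.h>

int main(void)
{
    const double dasl_period_ns = 8.0;  /* dasl clock period may range from 8 ns to 10 ns */
    const int cycles_per_slot   = 32;   /* high-z, p0, p1, reserved: 8 dasl cycles each    */
    const int slots             = 16;   /* 16 or 64, depending on the tb mode register     */
    const int my_tb             = 3;    /* example my_tb setting (hypothetical)            */

    double slot_ns   = cycles_per_slot * dasl_period_ns;
    double period_ns = slot_ns * slots;
    double start_ns  = my_tb * slot_ns; /* offset from the fall of res_sync */

    printf("slot = %.0f ns, res_sync period = %.0f ns, my slot starts at %.0f ns\n",
           slot_ns, period_ns, start_ns);
    return 0;
}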
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 171 of 554 6.3.2.2 configuration the res bus is activated by enabling the ingress and egress functions of the internal res data logic via the res_bus_configuration_en register. three bits in this register enable different functions:  the i_res_data_en bit enables the ingress logic to capture the res_data.  the e_res_data_en bit enables the egress logic to send its status on res_data.  the e_res_sync_en bit enables one of the np4gs3s to drive res_sync. in a standard blade configuration (one np4gs3 on each blade in a system), the res bus supports up to two np4gs3s without using external transceivers. the res_sync and res_data lines are wired together, and one of the np4gs3s is configured to drive res_sync. if, however, more than two np4gs3s are to coexist in a system and external transceivers are not used, a hub-based configuration must be implemented. figure 6-5 shows such a configuration. the hub is necessary to drive the required signal strength to all of the np4gs3s in a system when not using external transceivers, and is connected point-to-point to each np4gs3. the hub collects all res_data infor- mation from the egress logic of every np4gs3 and then distributes it based on the timing diagram in figure 6-5 to every np4gs3 ? s ingress logic. the res_sync signal may be generated by either a solitary np4gs3 or by the hub itself in this configuration. figure 6-5. hub-based res bus configuration to support more than two np4gs3s res_sync res_data np 0 np 1 np n res bus hub res_sync res_sync res_data res_data
ibm powernp np4gs3 network processor preliminary egress enqueuer / dequeuer / scheduler page 172 of 554 np3_dl_sec06_eeds.fm.08 may 18, 2001 6.3.3 hardware function 6.3.3.1 exponentially weighted moving average the hardware generates exponentially weighted moving average (ewma) values for the free queue count and the p0/p1 twin counts, thus removing the burden of this calculation from the picocode. in general, ewma for a counter x is calculated as follows: ewma_x = (1 - k) * ewma_x + k * x this calculation occurs for a configured sample period and k  {1,1/2,1/4,1/8}. 6.3.3.2 flow control hardware actions when the picocode enqueues a packet to be transmitted to a target port, the flow control logic examines the state of the target port pq+fq_th, and the priority of the enqueued packet to determine if any flow control actions are required.  if the fcinfo field of the fcbpage of the enqueued packet is set to x ? f ? , flow control is disabled and the packet is forwarded without regard to any of the congestion indicators.  for priority 0 packets, if target port pq+fq_th or p0 twin count threshold is exceeded, then the packet is discarded. the picocode must set up a counter block to count these discards.  for priority 1 packets, if p1 twin count threshold is exceeded, then the packet will be discarded. other- wise the transmit probability found in the qcb (available only when the scheduler is enabled) and the transmit probability table are accessed. the smaller of these values is compared against a random num- ber (  { 0 ... 1} ) generated by the hardware. when the transmit probability is zero or is less than the ran- dom number, the packet is discarded. the picocode must set up a counter block to count these discards. the index into the transmit probability table is ttccfp, where: tt packet type (egress fcbpage fcinfo field bits 3:2). cc dscp assigned color (egress fcbpage fcinfo field bits 1:0) f threshold exceeded status of the target flow queue (qcb threshold exceeded) p threshold exceeded status of the target port queue (target port pq+fq_th exceeded)
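the ewma update and the priority-1 enqueue decision described above can be summarized in software terms as follows. this is an illustrative sketch only: the function names, the 64-entry prob_table, and the use of the c library random number generator are assumptions, not hardware or picocode interfaces.

#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

/* ewma_x = (1 - k) * ewma_x + k * x, with k = 2^-k_shift and k_shift in 0..3
 * (k restricted to 1, 1/2, 1/4, 1/8) */
static uint32_t ewma_update(uint32_t ewma_x, uint32_t x, unsigned k_shift)
{
    return ewma_x - (ewma_x >> k_shift) + (x >> k_shift);
}

/* index into the egress transmit probability table: tt(2) cc(2) f(1) p(1) */
static unsigned ttccfp_index(unsigned tt, unsigned cc, bool f, bool p)
{
    return (tt << 4) | (cc << 2) | ((unsigned)f << 1) | (unsigned)p;
}

/* priority-1 enqueue decision: discard when the p1 twin count threshold is
 * exceeded, otherwise compare min(qcb probability, table probability) with a
 * random number; a probability of zero always discards. */
static bool p1_transmit(uint32_t p1_twins, uint32_t p1_threshold,
                        double qcb_prob, const double prob_table[64],
                        unsigned tt, unsigned cc, bool f, bool p)
{
    if (p1_twins > p1_threshold)
        return false;                      /* tail drop */
    double table_prob = prob_table[ttccfp_index(tt, cc, f, p)];
    double prob = qcb_prob < table_prob ? qcb_prob : table_prob;
    double r = (double)rand() / RAND_MAX;  /* the hardware uses its own generator */
    return prob > 0.0 && prob >= r;
}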
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 173 of 554 6.4 the egress scheduler the egress scheduler provides shaping functions in the network processor. the egress scheduler manages bandwidth on a per frame basis by determining the bandwidth a frame requires (that is, the number of bytes to be transmitted) and comparing this against the bandwidth permitted by the configuration of the frame ? sflow queue. the bandwidth used by the first frame affects when the scheduler permits the transmission of the second frame of a flow queue. the egress scheduler characterizes flow queues with the parameters listed in table 6-2 . table 6-3 lists valid combinations of the parameters described in table 6-2 . figure 6-6 presents a graphical representation of the egress scheduler. table 6-2. flow queue parameters parameter description low -latency sustainable bandwidth (lls) provides guaranteed bandwidth with qualitative latency reduction. lls has higher service priority than nls. flow queues connected to lls have bet- ter latency characteristics than flow queues connected to nls. normal-latency sustainable bandwidth (nls) provides guaranteed bandwidth. peak bandwidth service (pbs) provides additional bandwidth on a best-effort basis. queue weight allows the scheduler to assign available (that is, not assigned or currently not used by lls and nls) bandwidth to flow queues using best effort or pbs services. assignment of different ratios of the available bandwidth is accomplished by assign- ing different queue weights to queues that share the same target port. table 6-3. valid combinations of scheduler parameters qos lls nls weight pbs low latency with guaranteed bw shaping x normal latency with guaranteed bw shaping x best effort x best effort with peak rate x x normal latency with guaranteed bw shaping and best effort xx normal latency with guaranteed bw shaping and best effort and peak rate xxx
figure 6-6. the egress scheduler (diagram showing the lls and nls time-based calendars, the peak bandwidth service (pbs) calendar, the per-port wfq calendars for ports 0 through 39, flow queues 0 through 2047, and the high-priority and low-priority target port queues for ports 0 through 39 plus the discard and wrap queues)
ibm powernp np4gs3 preliminary network processor np3_dl_sec06_eeds.fm.08 may 18, 2001 egress enqueuer / dequeuer / scheduler page 175 of 554 6.4.1 egress scheduler components the egress scheduler consists of the following components:  scheduling calendars  2047 flow queues (flow qcb addresses 1-2047) and 2047 associated scheduler control blocks (scbs)  target port queues  discard queue  wrap queue 6.4.1.1 scheduling calendars the egress scheduler selects a flow queue to service every scheduler_tick. the duration of a scheduler_tick is determined by the configuration of the dram parameter register 11/10 field (bit 6 only). when this register field is set to 0, a scheduler_tick is 150 ns. when the field is set to 1, a scheduler_tick is 165 ns. there are three types of scheduling calendars used in the egress calendar design: time-based, weighted fair queuing (wfq), and wrap. time-based calendars the time-based calendars are used for guaranteed bandwidth (lls or nls) and for peak bandwidth service (pbs). weighted fair queuing calendars the wfq calendars allocate available bandwidth to competing flows on a per-port basis. available bandwidth is the bandwidth left over after the flows in the time-based calendars get their bandwidth. a wfq calendar is selected for service only when no service is required by the time-based calendars and the target port queue does not exceed a programmable threshold. the use of this threshold is the method that assures the wfq calendar dequeues frames to the target port at a rate equal to the port's available bandwidth. wrap calendar the wrap calendar is a 2-entry calendar for flows that use the wrap port. calendar type selection algorithm selection among calendar types occurs each scheduler_tick based on the following priority list: 1. time-based lls 2. time-based nls 3. time-based pbs 4. wfq 5. wrap
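as a summary of the selection order above, a minimal sketch follows; the *_pending flags are illustrative stand-ins for the calendars' internal "service required" state:

typedef enum { CAL_LLS, CAL_NLS, CAL_PBS, CAL_WFQ, CAL_WRAP, CAL_NONE } calendar_t;

/* one selection is made every scheduler_tick */
calendar_t select_calendar(int lls_pending, int nls_pending, int pbs_pending,
                           int wfq_pending, int wrap_pending)
{
    if (lls_pending)  return CAL_LLS;   /* time-based, low latency      */
    if (nls_pending)  return CAL_NLS;   /* time-based, normal latency   */
    if (pbs_pending)  return CAL_PBS;   /* time-based, peak bandwidth   */
    if (wfq_pending)  return CAL_WFQ;   /* weighted fair queuing        */
    if (wrap_pending) return CAL_WRAP;  /* wrap calendar                */
    return CAL_NONE;
}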
6.4.1.2 flow queues

there are 2047 flow queues (flow queue control block (qcb) addresses 1-2047) and scheduler control blocks. a flow qcb contains information about a single flow, as well as information that must be configured before the flow qcb can be used.

configuring sustainable bandwidth (ssd)

the sustained service rate (ssr) is defined as the minimum guaranteed bandwidth provided to the flow queue. it is implemented using either the lls or nls calendars. the following transform is used to convert the ssr from typical bandwidth specifications to scheduler step units (ssd field in the qcb):

ssd = (512 (bytes) / scheduler_tick (sec)) / sustained_service_rate (bytes/sec)

the flow qcb ssd field is entered in exponential notation as x * 16**y, where x is the ssd.v field and y is the ssd.e field. when an ssr is not specified, the flow qcb ssd field must be set to 0 during configuration. values of ssd that would cause the calendars to wrap should not be specified. once the maximum frame size is known, ssd max can be bound by:

ssd max ≤ 1.07 * 10**9 / maximum frame size (bytes)

an ssd value of 0 indicates that this flow queue has no defined guaranteed bandwidth component (ssr).

configuring flow qcb flow control thresholds

when the number of bytes enqueued in the flow queue exceeds the flow qcb threshold (th), the flow queue is congested. the flow control hardware uses this congestion indication to select the transmit probability. the following transform is used to specify this threshold:

th.e    threshold value (units in twin buffers)
0       th.v * 2**6
1       th.v * 2**9
2       th.v * 2**12
3       th.v * 2**15

the th.v and th.e fields are components of the flow control threshold field. a twin buffer contains approximately 106 bytes. a th value of 0 disables threshold checking.

configuring best effort service (qd)

the queue weight is used to distribute available bandwidth among queues assigned to a port. the remaining available bandwidth on a port is distributed among contending queues in proportion to the flow's queue weight.
the following transform is used to convert the queue weight into scheduler step units (qd field in the qcb):

qd = 1 / (queue weight)

a qd of 0 indicates that the flow queue has no best effort component.

configuring peak best effort bandwidth (psd)

peak service rate is defined as the additional bandwidth that this flow queue is allowed to use (the difference between the guaranteed and the peak bandwidth). for example, if a service level agreement provided for a guaranteed bandwidth of 8 mbps and a peak bandwidth of 10 mbps, then the peak service rate is 2 mbps. the following transform is used to convert peak service rate from typical bandwidth specifications to scheduler step units (psd field in the qcb):

psd = (512 (bytes) / scheduler_tick (sec)) / peak_service_rate (bytes/sec)

the flow qcb psd field is expressed in exponential notation as x * 16**y, where x is the psd.v field and y is the psd.e field. a psd value of 0 indicates that this flow queue has no peak service rate component.

target port

the destination target port (tp) id.

target port priority

the target port priority (p) field selects either the low-latency sustainable bandwidth (lls) or normal-latency sustainable bandwidth (nls) calendar for the guaranteed bandwidth component of a flow queue. values of 0 (high) and 1 (low) select the lls or nls calendar respectively. during an enqueue to a tp queue, the p field selects the correct queue. this field is also used in flow control.

transmit probability

flow control uses the transmit probability field. the picocode flow control algorithms update this field periodically.

6.4.1.3 target port queues

there are 82 target port queues, including 40 target ports with two priorities, a discard port, and a wrap port. the scheduler dequeues frames from flow qcbs and places them into the target port queue that is designated by the flow qcb's target port (tp) and priority (p) fields. a combination of a work-conserving round-robin and absolute priority selection services target port queues. the round-robin selects among the 40 media ports (target port ids 0 - 39) within each priority class (high and low, as shown in figure 6-6 on page 174). the absolute priority makes a secondary selection based on the following priority list:

1. high priority target port queues
2. low priority target port queues
3. discard queue
4. wrap queue
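the round-robin plus absolute-priority servicing just described can be sketched as follows; the pending arrays and the return-value encoding are illustrative only and do not correspond to a hardware interface:

#define MEDIA_PORTS 40

/* work-conserving round robin over ports 0-39 within one priority class;
 * returns the selected port or -1 when no queue in this class has work */
static int rr_select(const int pending[MEDIA_PORTS], int *rr_ptr)
{
    for (int i = 0; i < MEDIA_PORTS; i++) {
        int port = (*rr_ptr + i) % MEDIA_PORTS;
        if (pending[port]) {
            *rr_ptr = (port + 1) % MEDIA_PORTS;   /* advance the round-robin pointer */
            return port;
        }
    }
    return -1;
}

/* absolute priority: high-priority ports, then low-priority ports,
 * then the discard queue, then the wrap queue */
static int service_select(const int high[MEDIA_PORTS], const int low[MEDIA_PORTS],
                          int *rr_hi, int *rr_lo,
                          int discard_pending, int wrap_pending)
{
    int p = rr_select(high, rr_hi);
    if (p >= 0) return p;              /* 0..39:   high-priority queue for port p */
    p = rr_select(low, rr_lo);
    if (p >= 0) return 100 + p;        /* 100..139: low-priority queue for port p */
    if (discard_pending) return 200;   /* discard queue */
    if (wrap_pending)    return 201;   /* wrap queue    */
    return -1;                         /* nothing to service */
}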
the following information must be configured for each target port queue.

port queue threshold (th_pq)

when the number of bytes enqueued in the target port queue exceeds the port queue threshold, the corresponding weighted fair queueing (wfq) calendar cannot be selected for service. this back pressure to the wfq calendars assures that best effort traffic is not allowed to fill the target port queue ahead of frames dequeued from the low-latency sustainable bandwidth (lls) and normal-latency sustainable bandwidth (nls) calendars. the back pressure is the mechanism by which the target port bandwidth is reflected in the operation of the wfq calendars. the following transform is used to specify th_pq:

th_pq = port_queue_threshold (bytes) / 106 (bytes)

when configuring this threshold, use a value larger than the maximum transmission unit (mtu) of the target port.

port queue + flow queue threshold (th_pq+fq)

this is the threshold for the total number of bytes in the target port queue plus the total number of bytes in all flow queues that are configured for this target port. when the value in the queue exceeds this threshold, the target port is congested. the flow control mechanisms use this congestion indication to select a transmit probability. the following transform is used to specify th_pq+fq:

th_pq+fq = port_queue+scheduler_threshold (bytes) / 106 (bytes)

6.4.2 configuring flow queues

table 6-4 illustrates how to configure a flow qcb using the same set of scheduler parameter combinations found in table 6-3: valid combinations of scheduler parameters on page 173.

table 6-4. configure a flow qcb
qos                                                                        qcb.p  qcb.ssd  qcb.psd  qcb.qd
low latency with guaranteed bw shaping                                     0      ≠ 0      0        0
normal latency with guaranteed bw shaping                                  1      ≠ 0      0        0
best effort                                                                1      0        0        ≠ 0
best effort with peak rate                                                 1      0        ≠ 0      ≠ 0
normal latency with guaranteed bw shaping with best effort                 1      ≠ 0      0        ≠ 0
normal latency with guaranteed bw shaping with best effort, and peak rate  1      ≠ 0      ≠ 0      ≠ 0
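the transforms above, and those for ssd, psd, and qd in section 6.4.1.2, are simple unit conversions. the sketch below applies them for a 150 ns scheduler_tick; the example rates and thresholds are arbitrary, and the code is not a hardware or picocode interface:

#include <stdio.h>

#define SCHEDULER_TICK_SEC 150e-9   /* 150 ns or 165 ns, per the dram parameter register */
#define TWIN_BUFFER_BYTES  106      /* a twin buffer holds approximately 106 bytes       */

/* ssd/psd: scheduler step units for a rate given in bytes per second */
static double step_units(double rate_bytes_per_sec)
{
    return (512.0 / SCHEDULER_TICK_SEC) / rate_bytes_per_sec;
}

/* qd: inverse of the queue weight (0 means no best-effort component) */
static double qd_from_weight(double queue_weight)
{
    return queue_weight ? 1.0 / queue_weight : 0.0;
}

/* target port thresholds: byte counts expressed in twin buffers */
static unsigned th_pq(unsigned port_queue_threshold_bytes)
{
    return port_queue_threshold_bytes / TWIN_BUFFER_BYTES;
}

int main(void)
{
    /* example: 8 mbps guaranteed, 10 mbps peak -> peak service rate 2 mbps */
    double ssd = step_units(8e6 / 8.0);
    double psd = step_units(2e6 / 8.0);
    printf("ssd = %.1f, psd = %.1f, qd(weight 4) = %.2f, th_pq(9216 bytes) = %u twins\n",
           ssd, psd, qd_from_weight(4.0), th_pq(9216));
    return 0;
}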
6.4.2.1 additional configuration notes

- once the scheduler is enabled via the memory configuration register, the picocode is unable to enqueue directly to a target port queue. to disable the scheduler, the software system design must assure that the scheduler is drained of all traffic. the control access bus (cab) can examine all target port queue counts; when all counts are zero, the scheduler is drained.
- a flow queue control block (qcb) must be configured for discards (tp = 41). a sustained service rate must be defined (ssd ≠ 0). a peak service rate and queue weight must not be specified (psd = 0 and qd = 0).
- two flow qcbs must be defined for wrap traffic. one must be defined for guided traffic (tp = 40), and one for frame traffic (tp = 42). for these qcbs, the peak service rate and sustained service rate must not be specified (psd = 0 and ssd = 0). queue weight must not be zero (qd ≠ 0).
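the constraints in these notes can be captured in a small configuration table. the structure below is purely illustrative (the field set and types are not taken from this document); only the tp, ssd, psd, and qd constraints come from the notes above, and the nonzero values are arbitrary examples:

struct flow_qcb_cfg {
    unsigned tp;      /* target port: 40 = wrap guided, 41 = discard, 42 = wrap frame */
    double   ssd;     /* sustained service rate, scheduler step units (0 = none)      */
    double   psd;     /* peak service rate, scheduler step units (0 = none)           */
    double   qd;      /* 1 / queue weight (0 = no best-effort component)              */
};

static const struct flow_qcb_cfg discard_qcb     = { 41, 3413.3, 0.0, 0.0  }; /* ssd must be nonzero  */
static const struct flow_qcb_cfg wrap_guided_qcb = { 40, 0.0,    0.0, 0.25 }; /* qd must be nonzero   */
static const struct flow_qcb_cfg wrap_frame_qcb  = { 42, 0.0,    0.0, 0.25 }; /* qd must be nonzero   */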
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 181 of 554 7. embedded processor complex 7.1 overview the embedded processor complex (epc) performs all processing functions for the np4gs3. it provides and controls the programmability of the np4gs3. in general, the epc accepts data for processing from both the ingress and egress enqueuer / dequeuer / schedulers. the epc, under picocode control and with hardware- assisted coprocessors, determines what forwarding action is to be taken on the data. the data may be forwarded to its final destination, or may be discarded. the epc consists of the following components, as illustrated in figure 7-1 on page 184:  eight dyadic protocol processor units (dppu) each dppu consists of two core language processors (clps), nine shared coprocessors, one copro- cessor data bus, one coprocessor command bus, and a shared memory pool. each clp contains one arithmetic and logic unit (alu) and supports two picocode threads, so each dppu has four threads (see section 7.1.1 thread types on page 185 for more information). although there are 32 independent threads, each clp can execute the command of only one of its picocode threads, so at any instant only 16 threads are executing on all of the clps. the clps and coprocessors contain independent copies of each thread ? s registers and arrays. most coprocessors perform specialized functions as described below, and can operate concurrently with each other and with the clps.  interrupts and timers the np4gs3 has four interrupt vectors. each interrupt can be configured to initiate a dispatch to occur to one of the threads for processing. the device also has four timers that can be used to generate periodic interrupts.  instruction memory np4gs3a (r1.1): the instruction memory consists of eight embedded rams that are loaded during ini- tialization and contain the picocode for forwarding frames and managing the system. the total size is 16 k instructions. the memory is 4-way interleaved with four rams for the first 8 k instructions and four rams for the remaining 8 k instructions. each interleave provides four instruction words per access. np4gs3b (r2.0): the instruction memory consists of eight embedded rams that are loaded during ini- tialization and contain the picocode for forwarding frames and managing the system. the total size is 32 k instructions. the memory is 8-way interleaved, with each interleave providing four instruction words per access.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 182 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001  control store arbiter (csa) the csa controls access to the control store (cs) which allocates memory bandwidth among the threads of all the dyadic protocol processors. the cs is shared among the tree search engines and the picocode can directly access the cs through commands to the tree search engine (tse) coprocessor. the tse coprocessor also accesses the cs during tree searches.  dispatch unit the dispatch unit dequeues frame information from the ingress-eds and egress-eds queues. after dequeue, the dispatch unit reads part of the frame from the ingress or egress data store (ds) and places it into the datapool. as soon as a thread becomes idle, the dispatch unit passes the frame (with appropriate control information) to the thread for processing. the dispatch unit also handles timers and interrupts by dispatching the work required for these to an available thread.  completion unit (cu) the cu performs two functions: - it provides the interfaces between the epc and the ingress and egress edss. each eds performs an enqueue action whereby a frame address, together with appropriate parameters, is queued in a transmission queue or a dispatch unit queue. - the cu guarantees frame sequence. since multiple threads can process frames belonging to the same flow, the cu ensures that all frames are enqueued in the ingress or egress transmission queues in the proper order.  hardware classifier (hc) the hc parses frame data that is dispatched to a thread. the results are used to precondition the state of a thread by initializing the thread ? s general purpose and coprocessor scalar registers and a starting instruction address for the clp. parsing results indicate the type of layer 2 encapsulation, as well as some information about the layer 3 packet. recognizable layer 2 encapsulations include ppp, 802.3, dix v2, llc, snap header, and vlan tagging. reportable layer 3 information includes ip and ipx network protocols, five programmable network protocols, the detection of ip option fields, and transport protocols (udp, tcp) for ip.  ingress and egress data store interface and arbiter each thread has access to the ingress and egress data store through a data store coprocessor. read access is provided when reading ? more data ? and write access is provided when writing back the con- tents of the shared memory pool (smp) to the data store. one arbiter is required for each data store since only one thread at a time can access either data store.  control access bus (cab) arbiter each thread has access to the cab, which permits access to all memory and registers in the network processor. the cab arbiter arbitrates among the threads for access to the cab.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 183 of 554  debugging and single-step control the cab enables the gfh thread to control each thread on the device for debugging purposes. for example, the cab can be used by the gfh thread to run a selected thread in single-step execution mode. (see section 12. debug facilities on page 435 for more information.)  policy manager the policy manager is a hardware assist of the epc that performs policy management on up to 1 k ingress flows. it supports four management algorithms. the two algorithm pairs are ? single rate three color marker ? and ? two rate three color marker, ? both of which can be operated in color-blind or color- aware mode. the algorithms are specified in ietf rfcs 2697 and 2698, available at http://www.ietf.org .  counter manager the counter manager is a hardware assist engine used by the epc to manage counters defined by the picocode for statistics, flow control, and policy management. the counter manager is responsible for counter updates, reads, clears, and writes. the counter manager's interface to the counter coprocessor provides picocode access to these functions.  semaphore manager the semaphore manager assists in controlling access to shared resources, such as tables and control structures, through the use of semaphores. it grants semaphores either in dispatch order (ordered sema- phores) or in request order (unordered semaphores).  ludeftable, comptable, and free queues are tables and queues for use by the tree search engine. for more information, see section 8.2.5.1 the ludeftable on page 311.
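for reference, the policy manager described above implements the marker algorithms of ietf rfc 2697 and rfc 2698. as an illustration, a color-blind "single rate three color marker" (rfc 2697) behaves roughly as sketched below. this is a software sketch of the rfc algorithm only, not the policy manager's table formats or interfaces; the struct layout and function name are assumptions.

#include <stdint.h>

enum color { GREEN, YELLOW, RED };

struct srtcm {
    double cir;      /* committed information rate, bytes per second              */
    double cbs, ebs; /* committed and excess burst sizes, bytes                   */
    double tc, te;   /* current token counts (initialize to cbs and ebs)          */
    double last;     /* time of the last update, seconds                          */
};

enum color srtcm_mark(struct srtcm *m, double now, uint32_t bytes)
{
    /* replenish tokens at cir; overflow from the committed bucket spills
     * into the excess bucket (rfc 2697 token update rules) */
    double add = (now - m->last) * m->cir;
    m->last = now;
    m->tc += add;
    if (m->tc > m->cbs) { m->te += m->tc - m->cbs; m->tc = m->cbs; }
    if (m->te > m->ebs) m->te = m->ebs;

    if (m->tc >= bytes) { m->tc -= bytes; return GREEN; }
    if (m->te >= bytes) { m->te -= bytes; return YELLOW; }
    return RED;
}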
figure 7-1. embedded processor complex block diagram (block diagram showing the eight dppus and their tree search engines, the dispatch unit, completion unit, hardware classifier, control store arbiter with its on-chip and off-chip memories, instruction memory, cab arbiter, debug and single-step control, interrupts and timers, counter manager, policy manager, semaphore manager, the embedded powerpc 405 core, the ludeftable, comptable, and free queues, and the ingress and egress data store interfaces and arbiters)
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 185 of 554 7.1.1 thread types the epc has 32 threads that can simultaneously process 32 frames. a thread has a unique set of general purpose, scalar, and array registers, but shares execution resources in the clp with another thread and execution resources in the coprocessors with three other threads. each clp within a dppu can run two threads, making four threads per dppu, or 32 total. the first dppu contains the gfh, gth, and the ppc threads and the other seven dppus contain the gdh threads. these five types of threads are described in the following list:  general data handler (gdh) there are 28 gdh threads. gdhs are used for forwarding frames.  guided frame handler (gfh) there is one gfh thread available in the epc. a guided frame can only be processed by the gfh thread, but the gfh can be configured to process data frames like a gdh thread. the gfh executes guided frame-related picocode, runs device management related picocode, and exchanges control information with a control point function or a remote network processor. when there is no such task to perform and the option is enabled, the gfh may execute frame forwarding-related picocode.  general table handler (gth) there is one gth thread available in the epc. the gth executes tree management commands not available to other threads. the gth performs actions including hardware assist to perform tree inserts, tree deletes, tree aging, and rope management. the gth can process data frames like a gdh when there are no tree management functions to perform.  general powerpc handler request (gph-req) there is one gph-req thread available in the epc. the gph-req thread processes frames bound to the embedded powerpc. work for this thread is the result of a re-enqueue action from another thread in the epc to the gpq queue. the gph-req thread moves data bound for the powerpc to the powerpc ? s mailbox (a memory area) and then notifies the powerpc that it has data to process. see section 10.8 mailbox communications and dram interface macro on page 382 for more information.  general powerpc handler response (gph-resp) there is one gph-resp thread available in the epc. the gph-resp thread processes responses from the embedded powerpc. work for this thread is dispatched due to an interrupt initiated by the powerpc and does not use dispatch unit memory. all the information used by this thread is found in the embedded powerpc ? s mailbox. see section 10.8 mailbox communications and dram interface macro on page 382 for more information.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 186 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.2 dyadic protocol processor unit (dppu) this section describes the basic functionality of the dppu. two aspects of the dppu are discussed in detail in separate sections: section 7.3 beginning on page 194 discusses the operational codes (opcodes) for the core language processors (clps), and section 7.4 beginning on page 223 cover ? sthedppu ? s coproces- sors. each dppu consists of the following functional blocks, as illustrated in figure 7-2 (for np4gs3a (r1.1)) and figure 7-3 (for np4gs3b (r2.0)). two core language processors (clps)  eight coprocessors  a coprocessor data bus  a coprocessor command bus (the coprocessor databus and coprocessor execution interfaces)  a 4-kb shared memory pool (1 kb per thread) each clp supports two threads, so each dppu has four threads which execute the picocode that is used to forward frames, update tables, and maintain the network processor. each dppu interfaces with the following functional blocks of the epc:  instruction memory  dispatch unit  control store arbiter (csa)  completion unit (cu)  hardware classifier  interface and arbiter to the ingress and egress data stores  control access bus (cab) arbiter  debugging facilities  counter manager  policy manager  ludeftable queue  comp free queue  semaphore manager (np4gs3b (r2.0))
figure 7-2. dyadic protocol processor unit functional blocks (np4gs3a (r1.1)) (block diagram showing the two core language processors, the shared memory pool, the cpdi and cpei arbiters, and the checksum, data store, enqueue, control access bus, counter, policy, tree search engine, and string copy coprocessors, together with their interfaces to the instruction memory, hardware classifier, cab arbiter, counter manager, policy manager, completion unit, and the ingress and egress data store interfaces)
figure 7-3. dyadic protocol processor unit functional blocks (np4gs3b (r2.0)) (block diagram similar to figure 7-2, with the addition of the semaphore coprocessor and semaphore manager and the dispatcher / shared memory data interface (smdi) paths)

7.2.1 core language processor (clp)

each dppu contains two clps. the clp executes the epc's core instruction set and controls thread swapping and instruction fetching. each clp is a 32-bit picoprocessor consisting of:

- 16 32-bit or 32 16-bit general purpose registers (gprs) per thread. (for more information, see table 7-1: core language processor address map on page 190.)
- a one-cycle alu supporting an instruction set that includes:
  - binary addition and subtraction
  - bit-wise logical and, or, and not
  - compare
  - count leading zeros
  - shift left and right logical
  - shift right arithmetic
  - rotate left and right
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 189 of 554 - bit manipulation commands: set, clear, test, and flip - gpr transfer of halfword to halfword, word to word, and halfword to word with and without sign extensions - all instructions can be coded to run conditionally. this eliminates the need for traditional branch-and- test coding techniques, which improves performance and reduces the size of the code. all arithmetic and logical instructions can be coded to execute without setting alu status flags.  management for handling two threads with zero overhead for context switching  read-only scalar registers that provide access to the following information: - interrupt vectors -timestamps - output of a pseudo random number generator - picoprocessor status - work queue status (such as the ingress and egress data queues) - configurable identifiers (such as the blade identification) for more information, see table 7-1: core language processor address map on page 190 .  16-word instruction prefetch shared by each thread  instruction execution unit that executes branch instructions, instruction fetch, and coprocessor access  coprocessor data interface (cpdi) with the following features: - access from any byte, halfword, or word of a gpr to an array, or from an array to a gpr - access to coprocessor scalar registers - various sign, zero, and one extension formats - quadword transfers within the coprocessor arrays - quadword reset to zero of coprocessor arrays (np4gs3b (r2.0))  coprocessor execution interface (cpei) with the following features: - synchronous or asynchronous coprocessor operation - multiple coprocessor synchronization - synchronization and branch-on-coprocessor return code
ibm powernp np4gs3 network processor preliminary embedded processor complex page 190 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.2.1.1 core language processor address map figure 7-4. core language processor table 7-1. core language processor address map (page 1 of 3) name register (array) 1 number size (bits) access description pc x ? 00 ? 16 r program counter. address of the next instruction to be executed. alustatus x ? 01 ? 4r the current alu status flags: 3zero 2carry 1sign 0 overflow linkreg x ? 02 ? 16 r/w link register. return address for the most recent subroutine 1. a number in parentheses is the array number for this coprocessor. each array has a register number and an array number. alu instruction execution unit cpei interface cpdi interface instruction fetch interface instruction stack 8x32b flags control immediate data cpei cpdi instruction memory 128b interface gpr pool 16x32b gpr pool 16x32b cpei arbiter cpdi arbiter scalar registers coprocessor data interface
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 191 of 554 copstatus x ? 03 ? 10 r indicates whether the coprocessor is busy or idle. a coprocessor that is busy will stall the clp when the coprocessor command was executed synchronously or a wait command was issued for the coprocessor. 1 coprocessor is busy 0 coprocessor is idle coprtncode x ? 04 ? 10 r the definition of ok/ko is defined by the coprocessor. 1ok 0notok threadnum x ? 05 ? 5 r the thread number (0..31) stack_link_ptr x ? 06 ? 4 r the current pointer value into the clp link stack (np4gs3b (r2.0)) timestamp x ? 80 ? 32 r free-running, 1 ms timer randomnum x ? 81 ? 32 r random number for programmer ? suse intvector0 x ? 83 ? 32 r read-only copy of interrupt vector 0. reading this register has no effect on the actual interrupt vector 0. intvector1 x ? 84 ? 32 r read-only copy of interrupt vector 1. reading this register has no effect on the actual interrupt vector 1. intvector2 x ? 85 ? 32 r read-only copy of interrupt vector 2. reading this register has no effect on the actual interrupt vector 2. intvector3 x ? 86 ? 32 r read-only copy of interrupt vector 3. reading this register has no effect on the actual interrupt vector 3. idlethreads x ? 87 ? 32 r indicates that a thread is enabled and idle qvalid x ? 88 ? 32 r indicates status of the queues (valid or invalid). my_tb x ? 89 ? 6r my target blade. the blade number of the blade in which this network processor is currently residing (see section 13.13.9 my target blade address register (my_tb) on page 468) sw_defined_a x ? 8a ? 32 r software-defined register sw_defined_b x ? 8b ? 32 r software-defined register sw_defined_c x ? 8c ? 32 r software-defined register version_id x ? 8f ? 32 r contains the version number of the hardware. gprw0 x ? c0 ? 32 r general purpose register w0 gprw2 x ? c1 ? 32 r general purpose register w2 gprw4 x ? c2 ? 32 r general purpose register w4 gprw6 x ? c3 ? 32 r general purpose register w6 gprw8 x ? c4 ? 32 r general purpose register w8 gprw10 x ? c5 ? 32 r general purpose register w10 gprw12 x ? c6 ? 32 r general purpose register w12 gprw14 x ? c7 ? 32 r general purpose register w14 gprw16 x ? c8 ? 32 r general purpose register w16 gprw18 x ? c9 ? 32 r general purpose register w18 table 7-1. core language processor address map (page 2 of 3) name register (array) 1 number size (bits) access description 1. a number in parentheses is the array number for this coprocessor. each array has a register number and an array number.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 192 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.2.2 clp opcode formats the core instructions (opcodes) of the clp, their formats, and their definitions are discussed in detail in sec- tion 7.3 beginning on page 194 . 7.2.3 dppu coprocessors all data processing occurs in the eight dppu coprocessors. they are discussed in detail in section 7.4 begin- ning on page 223 . gprw20 x ? ca ? 32 r general purpose register w20 gprw22 x ? cb ? 32 r general purpose register w22 gprw24 x ? cc ? 32 r general purpose register w24 gprw26 x ? cd ? 32 r general purpose register w26 gprw28 x ? ce ? 32 r general purpose register w28 gprw30 x ? cf ? 32 r general purpose register w30 pgramstack0 x ? fc ? (0) 128 r/w entries in the program stack (used by the clp hardware to build instruction address stacks for the branch and link commands) pgramstack1 x ? fd ? (1) 128 r/w entries in the program stack (used by the clp hardware to build instruction address stacks for the branch and link commands) table 7-1. core language processor address map (page 3 of 3) name register (array) 1 number size (bits) access description 1. a number in parentheses is the array number for this coprocessor. each array has a register number and an array number.
7.2.4 shared memory pool

the 4-kb shared memory pool is used by all threads running in the dppu. each thread uses 1 kb, which is subdivided into the following areas:

table 7-2. shared memory pool
quadword address   owning coprocessor   array
0-2                enqueue              fcb page 1a
3                  data store           configuration quadword
4-6                enqueue              fcb page 1b
7                  clp                  stack 0
8-10               enqueue              fcb page 2
11                 clp                  stack 1
12-15              data store           scratch memory 0
16-23              data store           datapool
24-31              data store           scratch memory 1
32-47              tse                  tree search results area 0
40-47              tse                  tree search results area 1
48-63              tse                  tree search results area 2
56-63              tse                  tree search results area 3
7.3 clp opcode formats

this section describes the core instructions (opcodes) of the clp, their formats, and their definitions. clp opcodes fall into four categories: control opcodes, alu opcodes, data movement opcodes, and coprocessor execution opcodes. shaded areas of the opcode show which bits uniquely identify the opcode.

7.3.1 control opcodes

most control opcodes have a condition field (cond) that indicates under what condition of the alu flags z (zero), c (carry), n (negative), and v (overflow) this command should be executed. the clp supports 15 different signed and unsigned conditions.

table 7-3. condition codes (cond field)
value   alu flag comparison    meaning
0000    z = 1                  equal or zero
0001    z = 0                  not equal or not zero
0010    c = 1                  carry set
0011    c = 1 and z = 0        unsigned higher
0100    c = 0 or z = 1         unsigned lower or equal
0101    c = 0                  unsigned lower
0110    -                      reserved
0111    don't care             always
1000    n = 0                  signed positive
1001    n = 1                  signed negative
1010    n = v                  signed greater or equal
1011    z = 0 and n = v        signed greater than
1100    z = 1 or (n ≠ v)       signed less than or equal
1101    n ≠ v                  signed less than
1110    v = 1                  overflow
1111    v = 0                  no overflow
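table 7-3 maps directly to a simple predicate over the four alu flags. the c sketch below is an illustration only; the treatment of the reserved encoding 0110 as false is an assumption.

#include <stdbool.h>

/* evaluate a 4-bit cond field against the alu flags z, c, n, v per table 7-3 */
bool cond_true(unsigned cond, bool z, bool c, bool n, bool v)
{
    switch (cond & 0xF) {
    case 0x0: return z;                 /* equal or zero            */
    case 0x1: return !z;                /* not equal or not zero    */
    case 0x2: return c;                 /* carry set                */
    case 0x3: return c && !z;           /* unsigned higher          */
    case 0x4: return !c || z;           /* unsigned lower or equal  */
    case 0x5: return !c;                /* unsigned lower           */
    case 0x6: return false;             /* reserved                 */
    case 0x7: return true;              /* always                   */
    case 0x8: return !n;                /* signed positive          */
    case 0x9: return n;                 /* signed negative          */
    case 0xA: return n == v;            /* signed greater or equal  */
    case 0xB: return !z && (n == v);    /* signed greater than      */
    case 0xC: return z || (n != v);     /* signed less than or equal */
    case 0xD: return n != v;            /* signed less than         */
    case 0xE: return v;                 /* overflow                 */
    default:  return !v;                /* no overflow              */
    }
}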
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 195 of 554 7.3.1.1 nop opcode the nop opcode executes one cycle of time and does not change any state within the processor. 7.3.1.2 exit opcode the exit opcode terminates the current instruction stream. the clp will be put into an idle state and made available for a new dispatch. the exit command is executed conditionally, and if the condition is not true, it will be equivalent to a nop opcode. pseudocode: if (cond) then clp_state_machine <- idle else nop end if type 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 nop 0 0 0 0 1 0 0000 0000 0000000000 000000 0 0 exit 0 0 0 1 0 000 cond 00000000000000000000 branch and link 0 0 0 1 1h 1 1 cond rt target16 return 0 0 0 1 1 0 0 0 cond 000000000000000 00000 branch register 0 0 0 1 1h 0 0 cond rt 00000000000 10000 branch pc relative 0 0 0 1 1 0 1 0 cond 0 0 0 0 disp16 branch reg+off 0 0 0 1 1 0 0 1 cond rt target16
ibm powernp np4gs3 network processor preliminary embedded processor complex page 196 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.1.3 branch and link opcode the branch and link opcode performs a conditional branch, adds one to the value of the current program counter, and places it onto the program stack. opcode fields rt, h, and target16 determine the destination of the branch. if rt is in the range of 1 - 15, it is the word address of a gpr register, the contents of which are used as the base of the branch destination. if rt is 0, the base of the branch destination will be 0x ? 0000 ? .the h field indicates which half of the gpr register is used as the base address. an h = 0 indicates the even half of the gpr register (high half of the gpr) and h = 1 indicates the odd half. the target16 field is added to the base to form the complete branch destination. if the linkptr register indicates that the branch and link will overflow the 16-entry program stack, a stack error occurs. pseudocode: if (cond) then -- put ia+1 onto the program stack if (linkptr=endofstack) then stackerr <- 1 else programstack(linkptr+1) <- pc + 1 end if -- load the ia with the branch target if (rt=0) then pc <- target16 elsif (h=0) then pc <- gpr(rt)(31:16) + target16 else pc <- gpr(rt)(15:0) + target16 end if else nop end if
7.3.1.4 return opcode

the return opcode performs a conditional branch with the branch destination being the top of the program stack. if the linkptr register indicates that the stack is empty, a stack error occurs.

pseudocode:
if (cond) then
  if (linkptr=emptystack) then
    stackerr <- 1
  else
    pc <- programstack(linkptr)
  end if
else
  nop
end if

7.3.1.5 branch register opcode

the branch register opcode performs a conditional branch. opcode fields rt and h determine the destination of the branch. the rt field is the word address of a gpr register of which the contents will be used as the branch destination. the h field indicates which half of the gpr register will be used. an h = 0 indicates the even half of the gpr register (high half of the gpr), and h = 1 indicates the odd half.

pseudocode:
if (cond) then
  if (h=0) then
    pc <- gpr(rt)(31:16)
  else
    pc <- gpr(rt)(15:0)
  end if
else
  nop
end if

7.3.1.6 branch pc relative opcode

the branch pc relative opcode performs a conditional branch. opcode fields pc and disp16 determine the destination of the branch. the contents of the pc field are used as the base of the branch destination. the disp16 field is added to the base to form the complete branch destination.

pseudocode:
if (cond) then
  pc <- pc + disp16
else
  nop
end if
ibm powernp np4gs3 network processor preliminary embedded processor complex page 198 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.1.7 branch reg+off opcode the branch register plus offset opcode will perform a conditional branch. the opcode fields rt and target16 determine the destination of the branch. if rt is in the range of 1 - 15, it is the word address of a gpr register of which the contents of the high halfword (or even half) are used as the base of the branch destination. if rt is 0, the base of the branch destination is x ? 0000 ? . the target16 field is added to the base to form the com- plete branch destination. pseudocode: if (cond) then if (rt=0) then pc <- target16 else pc <- gpr(rt)(31:16) + target16 end if else nop end if
7.3.2 data movement opcodes

the data movement opcodes are used to transfer data to and from the coprocessor's scalar registers and arrays. the opcodes support 23 options of direction, size, extension, and fill, represented by the ot5 field. data movement opcodes access the processor data interface (pdi).

figure 7-5. ot5 field definition: loading halfword/word gprs from a halfword/word array
figure 7-6. ot5 field definition: loading gpr byte from array byte
figure 7-7. ot5 field definition: loading gpr halfword/word from array byte
figure 7-8. ot5 field definition: store gpr byte/halfword/word to array byte/halfword/word
(figures 7-5 through 7-8 show, for each ot5 encoding, which gpr bytes are loaded from or stored to the array, and whether the value is zero- or sign-extended.)
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 203 of 554 7.3.2.1 memory indirect opcode the memory indirect opcode transfers data between a gpr and a coprocessor array via a logical address in which the base offset into the array is contained in a gpr. the logical address consists of a coprocessor number, coprocessor array, base address, and an offset. the c# field indicates the coprocessor which will be accessed. the ca field indicates which array of the coprocessor will be accessed. the offset into the array is the contents of gpr, indicated by ra as a base plus the six bits of immediate offset, off6. ra is a half-word address in the range of 0 - 7. the x indicates whether or not the transaction will be in cell header skip mode. the memory indirect opcode is executed conditionally if the alu flags meet the condition represented by the cond field. the word address of the gpr register used to transfer or receive data from the array is indicated by the field r. the actual direction, size, extension format, and location of gpr bytes affected are indicated by the ot5 field. pseudocode: if (cond) then addr.coprocessor number <= c# addr.coprocessor array <= ca addr.array offset <= gpr(ra)+off6 addr.x_mode <= x if (ot5 = load) then gpr(r,ot5) <= array(addr) else array(addr) <= gpr(r,ot5) end if; end if type 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 memory indirect 0 0 1 ot5 cond r x ra c# ca off6 memory add indirect 0 1 0 ot5 cond r x ra 0 0 0 0 off8 memory direct 0 1 1ot5 cond rx 0 off2 c# ca off6 scalar access 0 1 1ot5 cond r0 100 c# cr scalar immed 1 1 0 1c# cr imm16 transfer qw 1 1 0 0 c#d cad 00 qwoffd 0000 c#s cas 0 1 qwoffs zero array 1 1 0 000 size cond boffset x000 c# ca 1 1qwoffs
ibm powernp np4gs3 network processor preliminary embedded processor complex page 204 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.2.2 memory address indirect opcode the memory address indirect opcode transfers data between a gpr and a coprocessor data entity (scalar or array) by mapping the coprocessor logical address into the base address held in the gpr indicated by ra. since not all of the coprocessors have arrays, maximum size arrays, or the maximum number of scalars, the address map will have many gaps. data access via this method is the same as access via the more logic- based method. the final address is formed by the contents of gpr indicated by ra plus off8 (eight bits). ra is a half-word address in the range of 0 - 7. the x indicates whether or not the transaction will be in cell header skip mode. the memory address indirect opcode is executed conditionally if the alu flags meet the condition repre- sented by the cond field. the word address of the gpr register used to transfer or receive data from the array is indicated by the field r. the actual direction, size, extension format, and location of gpr bytes affected are indicated by the ot5 field. pseudocode: if (cond) then address <= gpr(ra) + off8 addr.array_notscalar <= address(14) addr.coprocessor number <= address(13:10) addr.coprocessor array <= address(9:8) (if an array access) addr.scalar address <= address(7:0) (if a scalar access) addr.array offset <= address(7:0) (if an array access) addr.x_mode <= x if (ot5 = load) then if (addr.array_notscalar=1) then gpr(r,ot5) <= array(addr) else gpr(r,ot5) <= scalar(addr) end if; else if (addr.array_notscalar=1) then array(addr) <= gpr(r,ot5) else scalar(addr) <= gpr(r,ot5) end if; end if; end if
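the address decomposition used by this opcode can be illustrated as follows; the struct and function are illustrative only and do not correspond to any picocode or hardware interface:

#include <stdint.h>

struct cop_addr {
    unsigned array_not_scalar; /* address bit 14: 1 = array access, 0 = scalar access */
    unsigned cop_number;       /* address bits 13:10: coprocessor number              */
    unsigned array_number;     /* address bits 9:8: array number (array accesses)     */
    unsigned offset;           /* address bits 7:0: array offset or scalar address    */
};

/* final address = contents of gpr(ra) (a halfword) plus the 8-bit off8 field */
struct cop_addr decode_cop_addr(uint16_t gpr_ra, uint8_t off8)
{
    uint16_t address = (uint16_t)(gpr_ra + off8);
    struct cop_addr a;
    a.array_not_scalar = (address >> 14) & 0x1;
    a.cop_number       = (address >> 10) & 0xF;
    a.array_number     = (address >> 8)  & 0x3;
    a.offset           = address & 0xFF;
    return a;
}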
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 205 of 554 7.3.2.3 memory direct opcode the memory direct opcode transfers data between a gpr and a coprocessor array via a logical address that is specified in the immediate portion of the opcode. the logical address consists of a coprocessor number, coprocessor array, and an offset. the c# field indicates the coprocessor that is accessed. the ca field indi- cates which array of the coprocessor is accessed. the offset is the six bits of immediate field, off6. the x indi- cates whether or not the transaction will be in cell header skip mode. the memory direct opcode is executed conditionally if the alu flags meet the condition represented by the cond field. the word address of the gpr register used to transfer or receive data from the array is indicated by the r field. the actual direction, size, extension format, and location of gpr bytes affected are indicated by the ot5 field. pseudocode: if (cond) then addr.coprocessor number <= c# addr.coprocessor array <= ca addr.array offset <= off6 addr.x_mode <= x if (ot5 = load) then gpr(r,ot5) <= array(addr) else array(addr) <= gpr(r,ot5) end if; end if 7.3.2.4 scalar access opcode the scalar access opcode transfers data between a gpr and a scalar register via a logical address that con- sists of a coprocessor number and a scalar register number. the c# field indicates the coprocessor that is accessed. the scalar register number is indicated by the cr field. the scalar access opcode is executed con- ditionally if the alu flags meet the condition represented by the cond field. the word address of the gpr register used to transfer or receive data from the scalar register is indicated by the r field. the actual direc- tion, size, extension format, and location of gpr bytes affected are indicated by the ot5 field. pseudocode: if (cond) then addr.coprocessor number <= c# addr.scalar number <= ca if (ot5 = load) then gpr(r,ot5) <= scalar(addr) else scalar(addr) <= gpr(r,ot5) end if; end if
ibm powernp np4gs3 network processor preliminary embedded processor complex page 206 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.2.5 scalar immediate opcode the scalar immediate opcode writes immediate data to a scalar register via a logical address that is com- pletely specified in the immediate portion of the opcode. the logical address consists of a coprocessor num- ber and a coprocessor register number. the c# field indicates the coprocessor that is accessed. the cr field indicates the scalar register number. the scalar access opcode is executed conditionally if the alu flags meet the condition represented by the cond field. the data to be written is the imm16 field. pseudocode: if (cond) then addr.coprocessor number <= c# addr.scalar register number <= ca scalar(addr) <= imm16 end if 7.3.2.6 transfer quadword opcode the transfer quadword opcode transfers quadword data from one array location to another using one instruc- tion. the source quadword is identified by the c#s (coprocessor number), cas (source array number), and qwoffs (quadword offset into the array). the destination quadword is identified by the c#d, cad, and qwoffd.this transfer is only valid on quadword boundaries. pseudocode: saddr.coprocessor number <= c#s saddr.array number <= cas saddr.quadword offset <= qwoffs daddr.coprocessor number <= c#d daddr.array number <= cad daddr.quadword offset <= qwoffd array(daddr) <= array(saddr)
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 207 of 554
7.3.2.7 zero array opcode (np4gs3b (r2.0) only)
the zero array opcode zeroes out a portion of an array with one instruction. the size of the zeroed-out portion can be a byte, halfword, word, or quadword. quadword accesses must be on quadword address boundaries. accesses to a byte, halfword, or word can begin on any byte boundary and will have the same characteristics as any gpr-based write to the array. for example, if the array is defined to wrap from the end to the beginning, the zero array command wraps from the end of the array to the beginning. the c# field indicates the coprocessor that will be accessed. the ca field indicates which array of the coprocessor is accessed. the qwoff is the quadword offset into the array, and the boff is the byte offset into the array. the x indicates whether or not the transaction is in cell header skip mode. the opcode is executed conditionally if the alu flags meet the condition represented by the cond field. the actual size of the access is defined by the size field (byte = 00, halfword = 01, word = 10, quadword = 11). for quadword accesses, boff should equal 0x0.
pseudocode:
if (cond) then
  addr.coprocessor number <= c#
  addr.array number <= ca
  addr.quadword offset <= qwoff
  addr.byte offset <= boff
  if size = 00 then array(addr) <= 0x00
  if size = 01 then array(addr : addr+1) <= 0x0000
  if size = 10 then array(addr : addr+3) <= 0x00000000
  if size = 11 then array(addr : addr+15) <= 0x00000000000000000000000000000000
end if;
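as a worked illustration of the size field, the sketch below (c, illustrative names, array modeled as a plain byte image and without the end-of-array wrap behavior described above) shows how many bytes each size encoding clears and where the starting byte comes from.

    #include <stdint.h>
    #include <string.h>

    /* bytes cleared for each size encoding of the zero array opcode
     * (00 byte, 01 halfword, 10 word, 11 quadword). */
    static unsigned zero_array_bytes(unsigned size_field)
    {
        static const unsigned bytes[4] = { 1, 2, 4, 16 };
        return bytes[size_field & 0x3];
    }

    /* model of the zeroing against a byte image of a coprocessor array;
     * the starting byte is qwoff*16 + boff, and boff must be 0 for size 11. */
    static void zero_array_model(uint8_t *array_image, unsigned qwoff,
                                 unsigned boff, unsigned size_field)
    {
        memset(&array_image[qwoff * 16 + boff], 0, zero_array_bytes(size_field));
    }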
ibm powernp np4gs3 network processor preliminary embedded processor complex page 208 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.3 coprocessor execution opcodes the coprocessor execution opcodes are used to initiate operations or synchronize operations with the copro- cessors. the execution type opcodes initiate accesses on the processor execution interface (pei). a com- mand to a coprocessor consists of six bits of coprocessor operation and 44 bits of coprocessor arguments. the definition of the operation and argument bits is coprocessor dependent. a coprocessor can operate synchronously or asynchronously. synchronous operation means the execution of the thread initiating the coprocessor command stalls until the coprocessor operation is complete. asyn- chronous operation means the executing thread will not stall when a coprocessor execution command is issued, but rather can continue operating on the instruction stream. it is important to re-synchronize a copro- cessor command that was issued asynchronously before using resources that the coprocessor needs for execution of the command. this can be done with the ? wait ? coprocessor execution opcodes. the clp runs the instruction stream of two threads in which only one thread can actually execute in a given cycle. the clp thread that owns priority cannot be stalled by the execution of the non-priority thread. priority is granted to the only active thread in a clp or to the thread that is given priority by the other thread. the coprocessor execution opcodes indicated by the p bit in the following opcode map can give up the priority status of the thread. type 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 execute direct 1 1 1 1c#p cpop a 0 immed16 execute indirect 1 1 1 1c#p cpop a 1r immed12 execute direct conditional 1 1 1 0c# condop 0 1 immed16 execute indirect conditional 1 1 1 0c# condop 1 1r immed12 wait 1 1 1 00000p000 000 0 mask16 wait and branch 1 1 1 0c#p000 10ok 0 target16
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 209 of 554
7.3.3.1 execute direct opcode
the execute direct opcode initiates a coprocessor command in which all of the operation arguments are passed in the immediate portion of the opcode. the c# field indicates which coprocessor will execute the command. the cpop field is the coprocessor operation field. the immed16 field contains the operation arguments that are passed with the command. the a field indicates whether the command should be executed asynchronously. the p field indicates whether the thread should give up priority.
pseudocode:
exe.coprocessor number <= c#
exe.coprocessor operation <= cpop
exe.coprocessor arguments <= 0000000000000000000000000000 & immed16
coprocessor <= exe
if a=1 then pc <= pc+1 else pc <= stall end if
if p=1 then priorityowner(other thread) <= true
else priorityowner(other thread) <= priorityowner(other thread) end if;
ibm powernp np4gs3 network processor preliminary embedded processor complex page 210 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001
7.3.3.2 execute indirect opcode
the execute indirect opcode initiates a coprocessor command in which the operation arguments are a combination of a gpr register and an immediate field. the c# field indicates the coprocessor on which the command is to be executed. the cpop field is the coprocessor operation field. the r field is the gpr register to be passed as part of the operation arguments. the immed12 field contains the immediate operation arguments that are passed. the a field indicates whether the command should be executed asynchronously. the p field indicates whether the thread should give up priority.
pseudocode:
exe.coprocessor number <= c#
exe.coprocessor operation <= cpop
exe.coprocessor arguments <= immed12 & gpr(r)
coprocessor <= exe
if a=1 then pc <= pc+1 else pc <= stall end if
if p=1 then priorityowner(other thread) <= true
else priorityowner(other thread) <= priorityowner(other thread) end if;
7.3.3.3 execute direct conditional opcode
the execute direct conditional opcode is similar to the execute direct opcode except that the command can be issued conditionally based on the cond field. to make room in the opcode for the cond field, the coprocessor opcode field (op) is shortened to two bits. because the high order four bits of the coprocessor operation are assumed to be zeros, conditional operations are restricted to the lower four commands of a coprocessor. the command is assumed to be synchronous because the opcode does not have a bit to indicate whether it is asynchronous or synchronous. priority cannot be released with this opcode.
pseudocode:
if (cond) then
  exe.coprocessor number <= c#
  exe.coprocessor operation <= 0000&op
  exe.coprocessor arguments <= 0000000000000000000000000000 & immed16
  coprocessor <= exe
end if
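the 44-bit argument formats of the execute direct and execute indirect opcodes can be modeled as below; this is a minimal c sketch of the concatenations in the pseudocode (28 zeros & immed16, and immed12 & gpr(r)), with the value held in the low 44 bits of a uint64_t. the function names are illustrative.

    #include <stdint.h>

    /* execute direct: upper 28 argument bits are zero, immed16 in the low bits. */
    static uint64_t exec_direct_args(uint16_t immed16)
    {
        return (uint64_t)immed16;
    }

    /* execute indirect: 12-bit immediate concatenated above the 32-bit gpr. */
    static uint64_t exec_indirect_args(uint16_t immed12, uint32_t gpr_r)
    {
        return ((uint64_t)(immed12 & 0xfff) << 32) | gpr_r;
    }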
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 211 of 554 7.3.3.4 execute indirect conditional opcode the execute indirect conditional opcode is similar to the execute indirect opcode except that the execute indi- rect command can be issued conditionally based on the cond field. to make room in the opcode for the cond field, the coprocessor opcode field (op) is shortened to two bits. because the high order four bits of the copro- cessor operation are assumed to be 0s, conditional operations are restricted to the lower four commands of a coprocessor. the command is assumed to be synchronous because the opcode does not have a bit to indi- cate whether it is asynchronous or synchronous. priority cannot be released with this opcode. pseudocode: if (cond) then exe.coprocessor number <= c# exe.coprocessor operation <= 0000&op exe.coprocessor arguments <= immed12 & gpr(r) coprocessor <= exe end if 7.3.3.5 wait opcode the wait opcode synchronizes one or more coprocessors. the mask16 field (see the register bit list table in section 7.3.3 on page 208) is a bit mask (one bit per coprocessor) in which the bit number corresponds to the coprocessor number. the thread stalls until all coprocessors indicated by the mask complete their operations. priority can be released with this command. pseudocode: if reduction_or(mask16(i)=coprocessor.busy(i)) then pc <= stall else pc <= pc + 1 end if if p=1 then priorityowner(other thread)<= true else priorityowner(other thread)<= priorityowner(other thread) end if;
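reading the pseudocode as "stall while any coprocessor selected by the mask is still busy", the stall test reduces to a bitwise and, as in this c sketch (one mask bit per coprocessor, bit number equal to coprocessor number; this reading is drawn from the description above rather than a definitive statement of the hardware).

    #include <stdint.h>
    #include <stdbool.h>

    /* model of the wait opcode stall condition. */
    static bool wait_must_stall(uint16_t mask16, uint16_t copro_busy)
    {
        return (mask16 & copro_busy) != 0;   /* stall until all selected coprocessors are idle */
    }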
ibm powernp np4gs3 network processor preliminary embedded processor complex page 212 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.3.6 wait and branch opcode the wait and branch opcode synchronizes with one coprocessor and branch on its one bit ok/ko flag. this opcode causes the thread to stall until the coprocessor represented by c# is no longer busy. the ok/ko flag is then compared with the field ok. if they are equal, the thread branches to the address in the target16 field. priority can be released with this command. pseudocode: if coprocessor.busy(c#)=1 then pc <= stall elsif coprocessor.ok(c#)=ok then pc <= target16 else pc <= pc+1 end if if p=1 then priorityowner(other thread)<= true else priorityowner(other thread)<= priorityowner(other thread) end if;
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 213 of 554
7.3.4 alu opcodes
alu opcodes are used to manipulate gpr registers with arithmetic and logical functions. all of these opcodes execute in a single cycle. these opcodes are also used to manipulate the alu status flags used in conditional execution of opcodes. all alu opcodes are executed conditionally based upon the cond field.
7.3.4.1 arithmetic immediate opcode
the arithmetic immediate opcode performs an arithmetic function on a gpr register in which the second operand is a 12-bit immediate value. the word gpr operand and destination gpr are specified in the r field and the immediate operand is represented by imm4&imm8. the actual portion of the gpr and the extension of the immediate operand are shown in the ot3i table (figure 7-9). the arithmetic function performed is given in the aluop field. the aluop functions and, or, xor, tst, and compare are not used with this opcode; they have separate opcodes when the second operand is an immediate value. arithmetic opcodes can be performed without changing the alu status flags if the i field is a 1.
pseudocode:
if (cond) then
  alu.opr1 <= gpr(r,ot3i)
  alu.opr2 <= ot3i(immed4&immed8)
  gpr(r,ot3i) <= aluop.result(alu.opr1,alu.opr2)
  if (i=0) then alustatus <= aluop.flags(alu.opr1, alu.opr2) end if
end if
type 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
arithmetic immediate 1 0 0 0 i ot3i cond r imm12(11:8) aluop imm12(7:0)
logical immediate 1 0 0 1 i h lop cond r immed16
compare immediate 1 0 1 0 0 0 ot2i cond r immed16
load immediate 1 0 1 1 ot4i cond r immed16
arithmetic/logical register 1 1 0 0 i ot3r cond rd rs aluop 0 0 0 0 h n m
count leading zeros 1 0 1 0 1 hd w i cond rd rs 0 0 0 0 0 0 0 0 0 h n 0
ibm powernp np4gs3 network processor preliminary embedded processor complex page 214 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 table 7-4. aluop field definition aluop function pseudocode flags modified zcnv 0000 add result = opr1 + opr2 x x x x 0001 add w/carry result = opr1 + opr2 + c x x x x 0010 subtract result = opr1 - opr2 x x x x 0011 subtract w/carry result = opr1 - opr2 - c x x x x 0100 xor result = orp1 xor opr2 x x 0101 and result = opr1 and opr2 x x 0110 or result = opr1 or opr2 x x 0111 shift left logical result = opr1 <-opr2 , fill with 0s c = c when opr2 = 0 else c = 0 when opr2 > n else c = opr1(n - opr2); where n = size of opr1 xxx 1000 shift right logical result = fill with 0, opr1 ->opr2 c = c when opr2 = 0 else c = 0 when opr2 > n else c = opr1(opr2 - 1); where n = size of opr1 xxx 1001 shift right arithmetic result = fill with s, opr1 ->opr2 c = c when opr2 = 0 else c = opr1(n - 1) when opr2 > n else c = opr1(opr2 - 1); where n = size of opr1 xxx 1010 rotate right result = fill with opr1, opr1 ->opr2 c = c when opr 2 = 0 else c= opr1(modn(opr2 - 1)); where n = size of opr1 xxx 1011 compare opr1 - opr2 x x x x 1100 test opr1 and opr2 x x 1101 not result = not(opr1) x x 1110 transfer result = opr2 1111 reserved reserved
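the carry rules in table 7-4 are easy to misread, so the shift left logical row (aluop 0111) is modeled below for a 16-bit halfword operand: carry is unchanged for a zero shift, forced to 0 for shifts wider than the operand, and otherwise captures the last bit shifted out, opr1(n - opr2). this is an illustrative c sketch, not device code.

    #include <stdint.h>

    static uint16_t shift_left_logical16(uint16_t opr1, unsigned opr2,
                                         unsigned *carry /* in/out c flag */)
    {
        const unsigned n = 16;              /* size of opr1 */
        if (opr2 == 0)
            return opr1;                    /* c = c (unchanged) */
        if (opr2 > n) {
            *carry = 0;
            return 0;
        }
        *carry = (opr1 >> (n - opr2)) & 0x1;                 /* c = opr1(n - opr2) */
        return (opr2 == n) ? 0 : (uint16_t)(opr1 << opr2);   /* fill with 0s */
    }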
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 215 of 554
figure 7-9. ot3i field definition
(figure: ot3i encodings selecting the odd or even halfword or the full word of the gpr as the operand, with the 12-bit immediate field either zero extended or sign extended.)
ibm powernp np4gs3 network processor preliminary embedded processor complex page 216 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.4.2 logical immediate opcode the logical immediate opcode performs the logical functions and, or, xor, and test on a gpr register in which the second operand is a 16-bit immediate value. the r field specifies the word gpr operand and desti- nation gpr, and the h field specifies the even (h = 0) or odd (h = 1) halfword of this gpr. the immediate operand is represented by imm16. the arithmetic function performed is given in the lop field. logical opcodes can be performed without changing the alu status flags if the i field is a 1. pseudocode: if (cond) then alu.opr1 <= gpr(r:h) alu.opr2 <= immed16 gpr(r:h) <= lop.result(alu.opr1,alu.opr2) if (i=0) then alustatus <= lop.flags(alu.opr1, alu.opr2) end if end if 7.3.4.3 compare immediate opcode the compare immediate opcode performs the compare functions on a gpr register in which the second operand is a 16-bit immediate value. the source gpr operand and destination gpr are specified in the r field, and the immediate operand is represented by imm16. the actual portion of the gpr and extension of the immediate operand are shown in figure 7-10 . the compare immediate opcode always changes the alu status flags. pseudocode: if (cond) then alu.opr1 <= gpr(r,ot2i) alu.opr2 <= ot2i(immed16) alustatus <= compare(alu.opr1, alu.opr2) end if table 7-5. lop field definition lop function pseudocode flags modified zcnv 00 xor result = orp1 xor opr2 x x 01 and result = opr1 and opr2 x x 10 or result = opr1 or opr2 x x 11 test opr1 and opr2 x x
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 217 of 554 7.3.4.4 load immediate opcode the operation arguments of the load immediate opcode are a combination of a gpr register and a 16-bit immediate field. the source gpr operand and destination gpr are specified by the r field and the immedi- ate operand is represented by imm16. the actual portion of the gpr and extension of the immediate oper- and are shown in figure 7-11 and figure 7-12 . load immediate opcode never changes the alu status flags if executed. pseudocode: if (cond) then gpr(r,ot4i) <= ot4i(immed16) end if figure 7-10. ot2i field definition: compare halfword/word immediate gpr register ot2i field compare odd gpr register with immediate data 0 imm16 imm16 compare even gpr register with immediate data imm16 imm16 1 2 3 compare word gpr register with immediate data zero extend 0 s compare word gpr register with immediate data sign extend byte hw 0 1 01 2 3
ibm powernp np4gs3 network processor preliminary embedded processor complex page 218 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001
figure 7-11. ot4i field definition: load immediate halfword/word
(figure: ot4i encodings for loading the odd or even halfword gpr from the 16-bit immediate, or loading the word gpr with the immediate zero extended, sign extended, 1 extended, 0 postpended, or 1 postpended.)
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 219 of 554
figure 7-12. ot4i field definition: load immediate byte
(figure: ot4i encodings 8, 9, a, and b load gpr byte 3, 2, 1, or 0, respectively, from the low byte of the immediate data.)
ibm powernp np4gs3 network processor preliminary embedded processor complex page 220 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001
7.3.4.5 arithmetic register opcode
the arithmetic register opcode performs an arithmetic function on a gpr register in which the second operand is also a gpr register. the first gpr operand and the destination gpr are specified by the word address r1, and the specific portion to be used is encoded in ot3r. the second operand source is represented by word address r2, with the h field determining the even or odd halfword if the ot3r field indicates a halfword is needed. if the aluop is a shift or rotate command, the second operand source is used as the amount of the shift or rotate. otherwise, ot3r indicates the relationship between the two operands. if the aluop is a logical operation (and, or, xor, test), the second operand source can be further modified by the m and n fields. the m field is the mask field and, if active, creates a 1-bit mask in which the position of the '1' bit is given by the second operand source. the n field is the invert field and, if active, inverts the second operand source. if both the m and n fields are '1', the mask is created before the inversion. table 7-6 shows how to use these fields to create other operations from the basic logic commands. the arithmetic function performed is given in the aluop field. arithmetic opcodes can be performed without changing the alu status flags if the 'i' field is a 1.
pseudocode:
if (cond) then
  alu.opr1 <= gpr(r1,ot3r)
  alu.opr2 <= gpr(r2,h)
  if m=1 then alu.opr2 <= bitmask(alu.opr2) end if
  if n=1 then alu.opr2 <= not(alu.opr2) end if
  gpr(r2,ot3i) <= aluop.result(alu.opr1,alu.opr2)
  if (i=0) then alustatus <= aluop.flags(alu.opr1, alu.opr2) end if
end if
table 7-6. arithmetic opcode functions
function / aluop / m / n
bit clear / and / 1 / 1
bit set / or / 1 / 0
bit flip / xor / 1 / 0
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 221 of 554
figure 7-13. ot3r field definition
(figure: ot3r encodings selecting whether gpr r1 supplies a halfword or word operand and whether gpr r2 supplies a halfword or word second operand, with zero extension where the operand widths differ.)
ibm powernp np4gs3 network processor preliminary embedded processor complex page 222 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.3.4.6 count leading zeros opcode the count leading zeros opcode returns the number of zeros from left to right until the first 1-bit is encoun- tered. this operation can be performed on a halfword or word gpr register. there is a second variation of this command in which the return value of this command is the bit position of the first ? 1 ? from left to right in the gpr. the gpr to be analyzed is determined by the r fields. if the gpr is a halfword instruction, the h field indicates whether the even or odd half is used. the destination register is always a halfword register and is represented by the rd field and halfword indicator ? hd ? . the w field indicates whether the operation is a word or halfword. the n field determines if the command counts the number of leading zeros (n = 0) or if the command returns the bits position of the first one (n = 1). the only flag for this command is the overflow flag, which is set if the gpr being tested contains all zeros. the setting of this flag can be inhibited if the i field is a one. pseudocode: if (cond) then alu.opr1 <= gpr(rs,h) alu.result <= count_zero_left_to_right(alu.opr1) if n=1 then gpr(rd,hd) <= not(alu.result) else gpr(rd,hd) <= alu.result end if if (i=0) then alustatus <= aluop.flags(alu.opr1, alu.opr2) end if end if
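a c model of the count function helps pin down the flag behavior: the count is the number of zeros from the msb down to the first 1, and the overflow case is an all-zero operand. the n = 1 variant in the pseudocode stores the bitwise not of this count, whose low-order bits give the position of the first 1; that reading is an interpretation of the pseudocode, so treat it as an assumption. names are illustrative.

    #include <stdint.h>
    #include <stdbool.h>

    /* count leading zeros of a halfword operand; *all_zero mirrors the
     * overflow flag described above. */
    static unsigned clz16(uint16_t v, bool *all_zero)
    {
        unsigned count = 0;
        *all_zero = (v == 0);
        for (int bit = 15; bit >= 0 && ((v >> bit) & 1) == 0; bit--)
            count++;
        return count;                  /* the n = 1 variant stores not(count) */
    }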
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 223 of 554
7.4 dppu coprocessors
each dppu coprocessor is a specialized hardware assist engine that runs in parallel with the two clps and performs functions that would otherwise require a large amount of serialized picocode. dppu functions include modifying ip headers, maintaining flow information used in flow control algorithms, accessing internal registers via the cab, maintaining counts for flow control and for standard and proprietary management information blocks (mib), and enqueueing frames to be forwarded. the dppu coprocessors are:
tree search engine - see section 8. tree search engine on page 289.
data store - see section 7.4.2 beginning on page 224.
control access bus (cab) interface - see section 7.4.3 beginning on page 239.
enqueue - see section 7.4.4 beginning on page 242.
checksum - see section 7.4.5 beginning on page 256.
string copy - see section 7.4.6 beginning on page 261.
policy - see section 7.4.7 beginning on page 262.
counter - see section 7.4.8 beginning on page 263.
semaphore - see section 7.4.9 beginning on page 266.
a thread's address space is a distributed model in which registers and arrays reside within the coprocessors (each coprocessor maintains resources for four threads and, for address mapping purposes, the clp is considered to be a coprocessor). each coprocessor can have a maximum of 252 scalar registers and four arrays. the address of a scalar register or array within the dppu is a combination of the coprocessor number and the address of the entity within the coprocessor. likewise, the coprocessor instruction is a combination of the coprocessor number and the coprocessor opcode. the epc coprocessors are numbered as shown in table 7-7. the number is used in accesses on both the coprocessor execute and data interfaces. the tse is mapped to two coprocessor locations so that a thread can execute two searches simultaneously.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 224 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.4.1 tree search engine coprocessor the tree search engine (tse) coprocessor has commands for tree management, direct access to the control store (cs), and search algorithms such as full match (fm), longest prefix match (lpm), and software- managed tree (smt). for complete information, see section8. treesearchengine on page 289. 7.4.2 data store coprocessor the data store coprocessor provides an interface between the epc and the ingress data store, which contains frames that have been received from the media, and the egress data store, which contains reas- sembled frames received from the switch interface. the data store coprocessor also receives configuration information during the dispatch of a timer event or interrupt. each thread supported by the data store coprocessor has one set of scalar registers and arrays defined for it. the scalar registers control the accesses between the shared memory pool and the ingress and egress data stores. the data store coprocessor maintains them. the arrays are defined in the shared memory pool (see 7.2.4 shared memory pool on page 193):  the datapool, which can hold eight quadwords  two scratch memory arrays, which hold eight and four quadwords respectively  the configuration quadword array, which holds port configuration data dispatched to the thread. the shared memory pool arrays function as a work area for the data store coprocessor: instead of reading or writing small increments (anything less than a quadword, or 16 bytes) directly to a data store, a larger amount (one to four quadwords per operation) of frame data is read from the data store into these shared memory pool arrays or from the these arrays into the data store. the data store coprocessor has nine commands available to it. these commands are detailed in section 7.4.2.2 data store coprocessor commands on page 230. table 7-7. coprocessor instruction format coprocessor number coprocessor 0 core language processor (clp) 1 data store interface 2 tree search engine 0 3 tree search engine 1 4cabinterface 5 enqueue 6 checksum 7 string copy 8 policy 9 counter 10 reserved 11 semaphore
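for reference when forming coprocessor addresses and execute commands, the numbering of table 7-7 can be captured in a small enumeration; the enum itself is illustrative and only restates the table.

    /* coprocessor numbers from table 7-7. */
    enum np_coprocessor {
        COPRO_CLP        = 0,   /* core language processor */
        COPRO_DATA_STORE = 1,   /* data store interface    */
        COPRO_TSE0       = 2,   /* tree search engine 0    */
        COPRO_TSE1       = 3,   /* tree search engine 1    */
        COPRO_CAB        = 4,   /* cab interface           */
        COPRO_ENQUEUE    = 5,
        COPRO_CHECKSUM   = 6,
        COPRO_STRINGCOPY = 7,
        COPRO_POLICY     = 8,
        COPRO_COUNTER    = 9,
        /* 10 reserved */
        COPRO_SEMAPHORE  = 11
    };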
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 225 of 554 7.4.2.1 data store coprocessor address map table 7-8. data store coprocessor address map (a thread ? s scalar registers and arrays that are mapped within the data store coprocessor) name register (array) 1 number size (bits) access description dsa x ? 00 ? 19 r/w address of ingress or egress data store. used in all commands except ? read more ? and ? dirty ? . (ingress uses the least significant 11 bits and egress uses all 19 bits for the address.) lma x ? 01 ? 6r/w quadword address of shared memory pool. used in all commands except ? read more ? .(see 7.2.4 shared memory pool on page 193.) ccta x ? 02 ? 19 r/w current address of the ingress data buffer or the egress twin that was dispatched to the thread into the datapool. used in ? read more ? and ? dirty ? commands. the value is unitized at dispatch and updated on ? read more ? commands that pass though cell or twin boundaries. nqwa x ? 03 ? 3r/w indicates the qw address used for both the datapool and the qw in the datastore buffer/twin. see edirty (update egress dirty quad- words) command on page 237 , idirty (update ingress dirty quad- words) command on page 238 , rdmoree (read more quadword from egress) command on page 235 , and rdmorei (read more quadword from ingress) command on page 236 for details. iprotocoltype x ? 04 ? 16 r layer 3 protocol identifier set by the hardware classifier (see 7.7 hardware classifier on page 275). dirtyqw x ? 05 ? 8r/w quadwords in the datapool that have been written and therefore may not be equivalent to the corresponding data store data. used by the dirty update commands to write back any modified data into the corre- sponding data store. bci2byte x ? 06 ? 14/20 r/w a special-purpose register. the picocode running in the clp writes a 20-bit bci value, but when the register is read, it returns the 14-bit byte count represented by the bci. disp_dsu x ? 07 ? 2r/w initialized during dispatch to contain the same value as the dsu field in the egress fcbpage. this register is used by egress write com- mands to determine which egress data store the data store copro- cessor should access when writing data. this register has no meaning for ingress frames. disp_dsusel x ? 08 ? 1r/w initialized during dispatch to contain the same value as the dsu_sel field in the egress fcbpage. this register is used by egress read commands to determine which egress data store the data store coprocessor should access when reading data. this register has no meaning for ingress frames. disp_ingress x ? 09 ? 1r dispatched frame ? stype 0 egress frame 1 ingress frame configqw x ? fc ? (0) 128 r/w port configuration table entry for this frame scratchmem0 x ? fd ? (1) 512 r/w user defined array (use to store temporary information, build new frames, and so on) scratchmem1 x ? fe ? (2) 1024 r/w user defined array (use to store temporary information, build new frames, and so on) datapool x ? ff ? (3) 1024 r/w contains frame data from the dispatch unit 1. a number in parentheses is the array number for this coprocessor. each array has a register number and an array number.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 226 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 the datapool and data store access upon frame dispatch, the dispatch unit automatically copies the first n quadwords of a frame from the data store into the first n quadword positions of the datapool. the value of n is programmable in the port config- uration memory. typically, values of n are as follows: the read more commands (rdmorei and rdmoree) assist the picocode ? s reading of additional bytes of a frame by automatically reading the frame data into the datapool at the next quadword address and wrapping automatically to quadword 0 when the boundary of the datapool is reached. the picocode can also read or write the ingress and egress data stores at an absolute address, independent of reading sequential data after a dispatch. the datapool for ingress frames for an ingress dispatch, the n quadwords are stored in the datapool at quadword-address 0, 1, ? n-1. each quadword contains raw frame-bytes, (there are no cell header or frame headers for ingress frames in the datapool). after an ingress dispatch, the datapool contains the first n*16 bytes of the frame, where the first byte of the frame has byte address 0. the ingress datapool byte address definitions are listed in table 7-9 . when reading more than n quadwords of a frame (using the rdmorei command), the hardware automati- cally walks the data store ? s buffer control block (bcb) chain as required. quadwords read from the data store are written to the datapool at consecutive quadword-locations, starting at quadword address n (where n is the number of quadwords written to the datapool by the dispatch unit during frame dispatch). the quad- word-address wraps from 7 - 0. therefore, the picocode must save quadword 0 in case it is required later (for example, the quadword can be copied to a scratch array). the ingress data store can also be written and read with an absolute address. the address is a bcb address. n = 4 ingress frame dispatch n = 2 egress unicast frame dispatch n = 4 egress multicast frame dispatch n = 0 interrupts and timers table 7-9. ingress datapool byte address definitions quadword byte address 0 0123456789101112131415 1 16171819202122232425262728293031 2 32333435363738394041424344454647 3 48495051525354555657585960616263 4 64656667686970717273747576777879 5 80818283848586878889909192939495 6 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 7 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 227 of 554 reading from an absolute address can be used for debugging and to inspect the contents of the ingress ds. for absolute ingress data store access (reads and writes), the picocode must provide the address in the ingress data store, the quadword address in the datapool, and the number of quadwords to be transferred. figure 7-14 shows an example of a frame stored in the ingress data store: the datapool for egress frames for an egress dispatch, the n quadwords are stored in the datapool at a starting quadword-address that is determined by where the frame header starts in the twin, which in turn depends on how the frame is packed in the switch cell and stored in the egress data store (see figure 7-15 ). general-purpose register (gpr) r0 is initialized during dispatch as shown in table 7-10: egress frames datapool quadword addresses on page 228. gpr r0 can be used as an index into the datapool so that the variability of the location of the start of the frame in the datapool is transparent to the picocode. figure 7-14. a frame in the ingress data store quadword a quadword b quadword c quadword d frame quadword a quadword b quadword c quadword d data quadword a quadword b quadword c quadword d frame data frame data cell address a0 cell address a1 cell address a2 ingress data store a1 a2 a0 (current buffer) a2 a1 bcbs
ibm powernp np4gs3 network processor preliminary embedded processor complex page 228 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001
figure 7-15. frame in the egress data store (illustrating the effects of different starting locations)
(figure: four twin-buffer layouts showing a frame starting in quadword a, b, c, or d; ch marks the cell header and fh the frame header in each case.)
table 7-10. egress frames datapool quadword addresses
frame start in twin buffer / frame quadword address in the datapool / gpr r0
quadword a / 0 / 0
quadword b / 1 / 10
quadword c / 2 / 26
quadword d / 3 / 42
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 229 of 554 the relationship between quadwords a, b, c, and d in the twin buffer and the location of the quadword in the datapool is always maintained. that is, quadword a is always stored at quadword address 0 or 4, quadword b is always stored at quadword address 1 or 5, quadword c is always stored at quadword address 2 or 6, and quadword d is always stored at quadword address 3 or 7. in contrast with the ingress side, the egress datapool contains cell headers. when the exact content of the twin-buffer is copied into the datapool, it may include the 6-byte np4gs3 cell header if the quadword being copied comes from quadword a in the twin buffer. the datapool can be accessed in two modes:  normal datapool access: accesses all bytes in the datapool, including the 6-byte cell header. for example, it can be used for guided frames where information in the cell header may be important, or for debugging and diagnostics.  cell header skip datapool access: automatically skips the cell header from the datapool. the hardware assumes a cell header to be present at quadword-address 0 and 4. for example, accessing datapool[2] accesses the byte with phys- ical address 8, datapool[115] accesses the byte with physical address 127, and datapool[118] also accesses the byte with physical address 8. the maximum index that can be used for this access mode is 231. this mode is shown in table 7-11 . fewer frame-bytes may be available in the datapool for the egress due to the presence of cell headers. table 7-12 shows the number of frame-bytes in the datapool after a frame dispatch. table 7-11. datapool byte addressing with cell header skip quadword byte address 0 ?????? 0 116 1 117 2 118 3 119 4 120 5 121 6 122 7 123 8 124 9 125 1 10 126 11 127 12 128 13 129 14 130 15 131 16 132 17 133 18 134 19 135 20 136 21 137 22 138 23 139 24 140 25 141 2 26 142 27 143 28 144 29 145 30 146 31 147 32 148 33 149 34 150 35 151 36 152 37 153 38 154 39 155 40 156 41 157 3 42 158 43 159 44 160 45 161 46 162 47 163 48 164 49 165 50 166 51 167 52 168 53 169 54 170 55 171 56 172 57 173 4 ?????? 58 174 59 175 60 176 61 177 62 178 63 179 64 180 65 181 66 182 67 183 5 68 184 69 185 70 186 71 187 72 188 73 189 74 190 75 191 76 192 77 193 78 194 79 195 80 196 81 197 82 198 83 199 6 84 200 85 201 86 202 87 203 88 204 89 205 90 206 91 207 92 208 93 209 94 210 95 211 96 212 97 213 98 214 99 215 7 100 216 101 217 102 218 103 219 104 220 105 221 106 222 107 223 108 224 109 225 110 226 111 227 112 228 113 229 114 230 115 231
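the mapping in table 7-11 can be computed rather than looked up: logical indexes advance through 116 data bytes per pass (58 bytes after each of the two cell headers), so the physical byte is 6 + j for j < 58 and j + 12 otherwise, where j is the index modulo 116. the c sketch below reproduces the examples in the text (2 -> 8, 115 -> 127, 118 -> 8); it is an illustration of the table, not device code.

    /* cell header skip addressing: map a logical byte index 0-231 onto the
     * 128-byte datapool, skipping the 6-byte cell headers at physical bytes
     * 0-5 (quadword 0) and 64-69 (quadword 4). */
    static unsigned skip_mode_to_physical(unsigned logical_index)
    {
        unsigned j = logical_index % 116;       /* 116 data bytes per pass */
        return (j < 58) ? (6 + j) : (j + 12);   /* j + 12 == 70 + (j - 58) */
    }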
ibm powernp np4gs3 network processor preliminary embedded processor complex page 230 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 for example, when 24 bytes of frame data are always needed, the port configuration memory must be programmed with the n (number of quadwords to dispatch) equal to two. when 32 bytes are always needed, the number of quadwords to dispatch must be set to three. after a dispatch, picocode can use the rdmoree command when more frame data is required. one, two, three, or four quadwords can be requested. consult the ? guaranteed ? column in table 7-12 to translate the necessary number of bytes into the greater number of quadwords that must be read. for example, if the pico- code must dig into the frame up to byte 64, five quadwords are required. in this example, the number of quad- words specified with the rdmoree command equals 5 - n, where n is the number of quadwords initially written in the datapool by the dispatch unit. as a general rule, each set of four quadwords provides exactly 58 bytes. 7.4.2.2 data store coprocessor commands the data store coprocessor provides the following commands: table 7-12. number of frame-bytes in the datapool number of quadwords read start quadword a start quadword b start quadword c start quadword d guaranteed 1 10161616 10 2 26323226 26 3 42484242 42 4 58585858 58 5 68747474 68 6 84909084 84 7 100 106 100 100 100 8 116 116 116 116 116 command opcode description wreds 0 write egress data store. enables the clp to read data from one of the arrays in the shared memory pool (datapool and scratch memory array) and write it to the egress data store (in multiples of quad- words only). for more information, see wreds (write egress data store) command on page 232. rdeds 1 read egress data store. enables the clp to read data from the egress data store and write it into one of the arrays in the shared memory pool (in multiples of quadwords only). for more information, see rdeds (read egress data store) command on page 232. wrids 2 write ingress data store. enables the clp to write data to the ingress data store (in multiples of quad- words only). for more information, see wrids (write ingress data store) command on page 234. rdids 3 read ingress data store. enables the clp to read data from the ingress data store (in multiples of quadwords only). for more information, see rdids (read ingress data store) command on page 234.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 231 of 554 note: the egress data store has a longer access time than the ingress data store. for better performance, as much frame parsing should be done on the ingress side as possible. rdmoree 5 read more frame data from the egress data store. a hardware assisted read from the egress data store. rdmoree continues reading the frame from where the dispatch or last ? read more ? command left off and places the data into the datapool. as data is moved into the datapool, the hardware tracks the current location in the frame that is being read and captures the link pointer from the twin buffers in order to determine the address of the next twin buffer. this address is used by the hardware for subse- quent rdmoree requests until the twin is exhausted and the next twin is read. since the datapool is essentially a map of a twin's content, the frame data might wrap within the datapool; the picocode keeps track of the data ? s location within the datapool. for more information, see edirty (update egress dirty quadwords) command on page 237 rdmorei 7 read more frame data from the ingress data store. a hardware assisted read from the ingress data store. rdmorei continues reading the frame from where the dispatch or last ? read more ? command left off and places the data into the datapool. as data is moved into the datapool, the hardware tracks the current location in the frame that is being read and captures the link maintained in the buffer control block area in order to determine the address of the frame ? s next data buffer. this address is used by the hardware for subsequent rdmorei requests until the data buffer is exhausted and the next buffer is read. the picocode keeps track of the frame data ? s location within the datapool. for more information, see on page 236 leasetwin 8 lease twin buffer. returns the address of a free twin buffer. use this command when creating new data in the egress data store. for more information, see 7.4.3 control access bus (cab) coprocessor on page 239 edirty 10 update dirty quadword egress data store. the coprocessor keeps track of the quadwords within the current twin that have been modified in the datapool array. edirty enables the clp to write only the ? dirty ? data back to the egress data store (in multiples of quadwords only). this command is only valid within the datapool array and for the buffer represented by the scalar register current cell/twin address. for more information, see edirty (update egress dirty quadwords) command on page 237 idirty 12 update dirty quadword ingress data store. the coprocessor keeps track of the quadwords within the current buffer that have been modified in the datapool array. idirty enables the clp to write only the ? dirty ? data back to the ingress data store (in multiples of quadword units only). this command is only valid within the datapool array and for the buffer represented by the scalar register current cell/twin address. for more information, see idirty (update ingress dirty quadwords) command on page 238 command opcode description
ibm powernp np4gs3 network processor preliminary embedded processor complex page 232 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 wreds (write egress data store) command wreds writes to an absolute address in the egress data store. it can be specified if the quadwords are written to ds0, ds1, both ds0 and ds1, or if the decision is made automatically by information in the dispatch dsu register. the data store address can be any address in the egress data store, but the picocode must ensure that no twin overflow occurs during writing. for example, it is not a good idea to make the lma point to the last quad- word in a 64-byte buffer and set nrofquadword to four. rdeds (read egress data store) command rdeds reads from an absolute address in the egress data store. it can be specified if the quadwords are read from ds0 or ds1, or if this decision is made automatically by information in the dispatch dsu register. table 7-13. wreds input operand source name size direct indirect description nrofquadword 2 imm16(1..0) gpr(1..0) defines the number of quadwords to be written: 01 1 quadword 10 2 quadwords 11 3 quadwords 00 4 quadwords dscontrol 2 imm16(3..2) imm12(3..2) defines if the quadwords are written to ds0, ds1 or both: 00 writes to the default dsu 01 writes to ds0 10 writes to ds1 11 writes to ds0 and ds1 disp_dsu 2 r the default dsu, initialized at dispatch dsa 19 r data store address. the target twin address in the egress data stores. the starting qw destination within the twin is determined by the 3 low-order bits of the lma. 000 - 011 qw 0-3 of the first buffer of the twin. 100 - 111 qw 0-3 of the second buffer of the twin. lma 6 r local memory address. the quadword source address in the shared memory pool (see 7.2.4 shared memory pool on page 193). table 7-14. wreds output operand source name size direct indirect description lma 6 r local memory address. the lma will be the input value of the lma incremented by the number of quadword transfers com- pleted by the command.
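the 2-bit nrofquadword operand used by wreds (and by the other data store commands that follow) encodes one to four quadwords with 00 meaning four; a tiny helper makes the encoding explicit. illustrative only.

    /* decode the nrofquadword field: 01 -> 1, 10 -> 2, 11 -> 3, 00 -> 4. */
    static unsigned nr_of_quadwords(unsigned field2)
    {
        unsigned f = field2 & 0x3;
        return f ? f : 4;
    }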
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 233 of 554 when dscontrol is set to 00, the quadwords are read from the default ds, which is determined based on the dispatch dsusel field. table 7-15. rdeds input operand source name size direct indirect description nrofquadword 2 imm16(1..0) gpr(1..0) defines the number of quadwords to be read: 01 1 quadword 10 2 quadwords 11 3 quadwords 00 4 quadwords dscontrol 2 imm16(3..2) imm12(3..2) defines if the quadwords are read from ds0 or ds1: 00 reads from the default ds 01 reads from ds0 10 reads from ds1 11 reserved disp_dsusel 1 r indicates the default ds chosen at dispatch. 0ds0 1ds1 dsa 19 r data store address. the source address in the egress data store. lma 6 r local memory address. the quadword target address in the shared memory pool (see 7.2.4 shared memory pool on page 193). table 7-16. rdeds output operand source name size direct indirect description ok/ko (np4gs3b (r2.0)) 1flag ko - an error occurred when reading the fourth quadword of a twin buffer indicating the link pointer contained in the data had a parity error. ok - indicates that the link pointer had valid parity or that the link pointer wasn't read on this access. lma 6 r local memory address. the lma will be the input value of the lma incremented by the number of quadword transfers com- pleted by the command.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 234 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 wrids (write ingress data store) command wrids writes to an absolute address in the ingress data store. the data store address (dsa) can be any address in the ingress data store but the picocode must ensure that no buffer overflow occurs during writing. for example, it is not a good idea to make the dsa point to the last quadword in a 64-byte buffer and set nrofquadword to four. rdids (read ingress data store) command rdids reads from an absolute address in the ingress data store. table 7-17. wrids input operand source name size direct indirect description nrofquadword 2 imm16(1..0) gpr(1..0) defines the number of quadwords to be written: 01 1 quadword 10 2 quadwords 11 3 quadwords 00 4 quadwords dsa 19 r data store address. the target address in the ingress data store. the 11 lsbs of the dsa contain the address. the remaining eight msbs are not used. lma 6 r local memory address. the quadword source address in the shared memory pool (see 7.2.4 shared memory pool on page 193). table 7-18. wrids output operand source name size direct indirect description lma 6 r local memory address. the lma will be the input value of the lma incremented by the number of quadword transfers com- pleted by the command. table 7-19. rdids input operand source name size direct indirect description nrofquadword 2 imm16(1..0) gpr(1..0) defines the number of quadwords to be read: 01 1 quadword 10 2 quadwords 11 3 quadwords 00 4 quadwords dsa 19 r data store address. the source address in the ingress data store lma 6 r local memory address. the quadword target address in the shared memory pool (see 7.2.4 shared memory pool on page 193).
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 235 of 554 the data store address (dsa) can be any address in the ingress data store. the picocode must ensure that no buffer overflow occurs during reading. for example, it is not a good idea to make the dsa point to the last quadword in a 64-byte buffer and set nrofquadword to four. rdmoree (read more quadword from egress) command after an egress frame dispatch, the hardware stores the first n quadwords of the frame in the datapool. rdmoree is used to read more quadwords from the egress data store. it uses three internal registers that are maintained by the data store coprocessor hardware (dispatch dsusel, current cell/twin address, and next quadword address registers) to maintain the current position in the egress data store and the shared memory pool. during a rdmoree, the hardware automatically reads the link pointer to update the current/ cell twin register when a twin boundary is crossed. the rdmoree can be executed more than once if more quadwords are required. table 7-20. rdids output operand source name size direct indirect description lma 6 r local memory address. the lma will be the input value of the lma incremented by the number of quadword transfers com- pleted by the command. table 7-21. rdmoree input operand source name size direct indirect description nrofquadword 2 imm16(1..0) gpr(1..0) defines the number of quadwords to be read. 01 1 quadword 10 2 quadwords 11 3 quadwords 00 4 quadwords dscontrol 2 imm16(3..2) imm12(3..2) defines if the quadwords are read from ds0 or ds1: 00 reads from the default ds 01 reads from ds0 10 reads from ds1 11 reserved disp_dsusel 1 r indicates the default ds chosen at dispatch. 0ds0 1ds1 ccta 19 r current cell/twin address. contains the twin source address in the egress data store. this register is initalized at dispatch to point to the current twin fetched at dispatch. nqwa 3 r next quadword address. the contents of this register indicates both the target quadword address in the datapool as well as the source quadword of the twin buffer. this register is initalized at dispatch to point to the next qw after the data fetched at dis- patch.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 236 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 rdmorei (read more quadword from ingress) command after an ingress frame dispatch, the hardware stores the first n quadwords of the frame in the datapool. rdmorei is used to read more quadwords from the ingress data store. it uses two internal registers that are maintained by the data store coprocessor hardware (current cell/twin address and next quadword address registers) to maintain the current position in the ingress data store and the shared memory pool. during a rdmorei, the hardware automatically reads the bcb to update the current/cell twin register when a cell-boundary is crossed. rdmorei can be executed more than once if more quadwords are required. table 7-22. rdmoree output operand source name size direct indirect description ok/ko (np4gs3b (r2.0)) 1flag ko - an error occurred when reading the fourth quadword of a twin buffer indicating that the link pointer contained in the data had a parity error. ok - indicates that the link pointer had valid parity or that the link pointer wasn't read on this access. ccta 19 r current cell/twin address. this register is updated by hard- ware. executing rdmoree again reads the next quadwords from egress data store. nqwa 3 r next quadword address. this register is updated by hardware. executing rdmoree again causes the quadwords being read to be stored in the next locations in the datapool. table 7-23. rdmorei input operand source name size direct indirect description nrofquadword 2 imm16(1..0) gpr(1..0) defines the number of quadwords to be read. 01 1 quadword 10 2 quadwords 11 3 quadwords 00 4 quadwords ccta 19 r current cell/twin address. contains the source ingress data buffer address in the ingress data store. this register is inital- ized at dispatch to point to the next data buffer after the data fetched at dispatch. nqwa 3 r next quadword address. the contents of this register indicates both the target quadword address in the datapool as well as the source quadword of the ingress data buffer. the low two bits indicate the qw in the ingress data buffer, while all three bits are used to indicate the qw in the datapool. this register is ini- talized at dispatch to point to the next qw after the data fetched at dispatch.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 237 of 554 leasetwin command this command leases a 19-bit twin address from the egress pool of free twins. edirty (update egress dirty quadwords) command edirty writes a quadword from the datapool array to the egress data store if the quadword has been modi- fied since being loaded into the datapool by a dispatch or a rdmoree command. the data store copro- cessor maintains a register (dirty quadword) which indicates when a quadword within the datapool has been modified. the edirty command uses the dispatch dsu register to determine which egress data store (ds0, ds1, or both) must be updated. when the current cell/twin address is modified due to a rdmoree command, all dirty bits are cleared, indicating no quadwords need to be written back to the egress data store. table 7-24. rdmorei output operand source name size direct indirect description ccta 19 r current cell/twin address. this register is updated by hard- ware. executing rdmoree again reads the next quadwords from egress data store. nqwa 3 r next quadword address. this register is updated by hardware. executing rdmoree again causes the quadwords being read to be stored in the next locations in the datapool. table 7-25. leasetwin output operand source name size direct indirect description ok/ko (np4gs3b (r2.0)) 1flag ko - no twins are currently available. ok - the command was successful. ccta 19 r current cell/twin address. contains the address of the newly leased twin.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 238 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 idirty (update ingress dirty quadwords) command idirty writes a quadword from the datapool array to the ingress data store if the quadword has been modi- fied since being loaded into the datapool by a dispatch or a rdmorei command. the data store copro- cessor maintains a register (dirty quadword) which indicates when a quadword within the datapool has been modified. the idirty command uses the next quadword address to determine which cell within the datapool is represented by the current cell/twin address. when the current cell/twin address is modified due to a rdmorei command, all dirty bits are cleared, indicating no quadwords need to be written back to the ingress data store. table 7-26. edirty inputs operand source name size direct indirect description ccta 19 r current cell/twin address. contains the twin target address in the egress data store. nqwa 3 r next quadword address. the contents of this register indicates which qws are considered for edirty processing. nqwa datapool qw target twin qw 07-0 7-0 10 0 21-0 1-0 32-0 2-0 43-0 3-0 54-0 4-0 65-0 5-0 76-0 6-0 dirtyqw 8 r indicates which quadwords need to be updated dscontrol 2 imm16(3..2) imm12(3..2) defines if the quadwords are written to ds0, ds1 or both: 00 writes to the default dsu. 01 writes to ds0 10 writes to ds1 11 writes to ds0 and ds1 disp_dsu 2 r the default dsu, initialized at dispatch. table 7-27. edirty output operand source name size direct indirect description dirtyqw 8 r dirty quadwords. bits which represent the data quadwords that were updated on this command will be reset to ? 0 ? .
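the dirty-quadword bookkeeping behind edirty and idirty can be pictured as one bit per datapool quadword that is set when picocode writes the quadword and cleared once the quadword has been written back. the sketch below models that flow; write_back_qw() is a hypothetical stand-in for the actual data store access, and the whole fragment is an illustration of the behavior described above rather than the hardware mechanism itself.

    #include <stdint.h>

    static void write_back_qw(unsigned qw_index) { (void)qw_index; }  /* hypothetical store */

    /* set the dirty bit for a datapool quadword that has just been modified. */
    static uint8_t mark_dirty(uint8_t dirtyqw, unsigned qw_index)
    {
        return (uint8_t)(dirtyqw | (1u << (qw_index & 0x7)));
    }

    /* write back only the dirty quadwords and clear their bits, as the
     * dirty update commands do (tables 7-27 and 7-29). */
    static uint8_t flush_dirty(uint8_t dirtyqw)
    {
        for (unsigned qw = 0; qw < 8; qw++)
            if (dirtyqw & (1u << qw))
                write_back_qw(qw);
        return 0;
    }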
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 239 of 554 7.4.3 control access bus (cab) coprocessor the cab coprocessor provides interfaces to the cab arbiter and the cab for a thread. a thread must load the operands for a cab access, such as cab address and data. the protocol to access the cab is then handled by the cab interface coprocessor. 7.4.3.1 cab coprocessor address map the cab coprocessor has three scalar registers that are accessible to a thread and are shown in table 7-30 : table 7-28. idirty inputs operand source name size direct indirect description ccta 19 r current cell/twin address. the source address in the ingress data store. initially, this register is set during a dispatch. nqwa 3 r next quadword address. the contents of this register indicates which qw are considered for action by the idirty command. nqwa datapool qw target ingress data buffer qw 07-4 3-0 10 0 21-0 1-0 32-0 2-0 43-0 3-0 54 0 65-4 1-0 76-4 2-0 dirtyqw 8 r indicates which quadwords need to be updated table 7-29. idirty output operand source name size direct indirect description dirtyqw 8 r dirty quadwords. bits which represent the data quadwords that were updated on this command will be reset to ? 0 ? . table 7-30. cab coprocessor address map symbolic register name register number size (bits) access description cabstatus x ? 00 ? 3r status register bit 2 busy bit 1 0 = write access 1 = read access bit 0 arbitration granted cabdata x ? 01 ? 32 r/w data to be written to cab, or data read from cab cabaddress x ? 02 ? 32 w address used during last cab access
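to make the cabstatus encoding in table 7-30 concrete, the sketch below decodes the 3-bit register value into its busy, read/write, and arbitration-granted indications. it is an illustrative host-side helper only, not picocode; the structure and function names are hypothetical and assume nothing beyond the bit definitions in the table.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* decoded view of the 3-bit cabstatus register (table 7-30) */
struct cab_status {
    bool busy;         /* bit 2: access in progress              */
    bool read_access;  /* bit 1: 1 = read access, 0 = write      */
    bool arb_granted;  /* bit 0: cab arbitration granted         */
};

static struct cab_status decode_cabstatus(uint8_t raw)
{
    struct cab_status s;
    s.busy        = (raw >> 2) & 1;
    s.read_access = (raw >> 1) & 1;
    s.arb_granted = raw & 1;
    return s;
}

int main(void)
{
    struct cab_status s = decode_cabstatus(0x5); /* busy, write access, arbitration granted */
    printf("busy=%d read=%d granted=%d\n", s.busy, s.read_access, s.arb_granted);
    return 0;
}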
7.4.3.2 cab access to np4gs3 structures the control address bus (cab) is the np4gs3's facility for accessing internal registers. the cab is accessible via picocode and is used for both configuration and operational functions. cab addresses consist of three fields, defined as follows: the first field, the five most significant bits of the address, selects one of 32 possible functional islands within the device. the correspondence between the encoded functional island value and the functional island name is shown in table 7-32, cab address, functional island encoding, below. although some functional islands have island_id values, they are not accessed via the cab. these functional islands are the ingress data store, egress data store, and control store. structures in these functional islands are accessed via the data store coprocessor and the tse.
table 7-31. cab address field definitions (32 bits total)
island id: 5 bits | structure address / element address: 23 bits | word addr: 4 bits
table 7-32. cab address, functional island encoding
island_id | functional island name | notes
'00000' | ingress data store | 1
'00001' | ingress pmm |
'00010' | ingress eds |
'00011' | ingress sdm |
'00100' | embedded processor complex |
'00101' | spm |
'00110' | ingress flow control |
'00111' | embedded powerpc |
'01000' | control store | 1
'01111' | reserved |
'10000' | egress data store | 1
'10001' | egress pmm |
'10010' | egress eds |
'10011' | egress sdm |
'10100' | configuration registers |
'10101' | dasl |
'10110' | egress flow control |
'10111'-'11111' | reserved |
1. these functional islands are not accessible via the cab
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 241 of 554 the second portion of the cab address consists of the next most significant 23 bits. this address field is segmented into structure address and element address. the number of bits used for each segment can vary from functional island to functional island. some functional islands contain only a few large structures while others contain many small structures. the structure address addresses an array within the functional island while the element address addresses an element within the array. the data width of an element is variable and can exceed the 32-bit data width of the cab. the third portion of the cab address consists of a 4-bit word address for selecting 32-bit segments of the element addressed. this address is necessary for moving structure elements wider than 32-bits across the cab. 7.4.3.3 cab coprocessor commands the cab coprocessor provides the following commands: cabarb (cab arbitration) command cabarb requests to become a master on the cab interface or requests to release the cab after master status has been granted. cabarb does not cause a stall in the clp even if run synchronously. the cab coprocessor always indicates that cabarb was executed immediately, even if the arbiter did not grant the cab interface to the coprocessor. the picocode must release ownership of the cab interface when it is finished accessing the cab or a lockout condition could occur for all non-preempt accesses. command opcode description cabarb 0 cab arbitration. used by a thread to gain access to the cab. once access is granted, that thread maintains control of the cab until it releases the cab. for more information, see cabarb (cab arbitration) command on page 241 cabaccess 1 access cab. moves data onto or from the cab and the attached cab accessible registers. the source and destination within the dppu are gprs. for more information, see cabaccess command on page 242 cabpreempt 3 preempt cab. used only by the gfh thread, it enables the gfh to gain control of the cab for a single read/write access, even if the cab has already been granted to another thread. for more information, see cabpreempt command on page 242 table 7-33. cabarb input operand source name size direct indirect description start_nend 1 imm16(0) 1 start arbitration 0 release arbitration
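to make the three-field cab address layout concrete, here is a hedged sketch that assembles a 32-bit cab address from an island id, a combined structure/element address, and a word address. because the split between structure and element address varies from functional island to functional island, the 23-bit middle field is treated as a single value here; the function name and the example values are illustrative only.

#include <stdint.h>
#include <stdio.h>

/* assemble a 32-bit cab address (table 7-31):
 * bits 31:27  island id (5 bits)
 * bits 26:4   structure address + element address (23 bits, island-specific split)
 * bits  3:0   word address (selects a 32-bit word of the element)              */
static uint32_t make_cab_address(uint8_t island_id, uint32_t struct_elem, uint8_t word)
{
    return ((uint32_t)(island_id & 0x1f) << 27)
         | ((struct_elem & 0x7fffff) << 4)
         | (word & 0xf);
}

int main(void)
{
    /* example: configuration registers island ('10100'), structure/element 1,
     * word 0 -- values chosen only for illustration                            */
    printf("cab address = 0x%08x\n", make_cab_address(0x14, 0x000001, 0));
    return 0;
}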
ibm powernp np4gs3 network processor preliminary embedded processor complex page 242 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 cabaccess command performs a read or write access on the cab. before a cab access can be performed, a cabarb command must have been issued to acquire ownership of the cab interface. cabpreempt command cabpreempt has the same input and output parameters as cabaccess, except that a high-priority access to the cab is performed. no cabarb command is required before cabpreempt. if any other coprocessor is cab bus master (because it previously executed a cabarb), cabpreempt takes control of the cab bus and executes the cab read or write. after command execution, control of the cab bus returns to the previous owner. use cabpreempt with care. for example, it might be used in debug mode when the gfh is single stepping one or more other coprocessors. to give a single step command, a cab write must be executed using the cabreempt command because the coprocessor being single stepped may be executing a cabaccess command and become cab bus master. if the gfh used the cabaccess command instead of cabpre- empt, a deadlock would occur. 7.4.4 enqueue coprocessor the enqueue coprocessor manages the interface between a thread and the completion unit and manages the use of the fcbpage that is maintained in the shared memory pool. each thread has three fcbpage locations in which enqueue information about a frame may be maintained. two of the pages improve the performance of the completion unit interface when they are alternated during consecutive enqueues. the picocode written for the thread does not differentiate between these two pages because hardware manages the swap. the thread uses the third page to allow the picocode to create new frames. when a thread issues an enqueue command, the first fcbpage is marked as in-use. if the other fcbpage is available, the coprocessor is not considered ? busy ? and will not stall the clp even if the command was issued synchronously. the completion unit fetches the fcbpage from the shared memory pool through the enqueue coprocessor and provides its information to the eds (either ingress or egress as indicated by the enqueue command). the fcbpage is then marked as free. if both fcbpages are marked in use, the enqueue coprocessor is considered busy and stalls the clp if a synchronous command initiated enqueue. to guarantee that fcbpage data is not corrupted, enqueue commands must always be synchronous. table 7-34. cabaccess input operand source name size direct indirect description read_nwrite 1 imm12(0) 1 perform a cab read 0 perform a cab write address 32 gpr(31..0) the cab address cabdata 32 r load for cab write command table 7-35. cabaccess output operand source name size direct indirect description cabdata 32 r set for cab read command
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 243 of 554 note: when an enqueue command is issued and the other location of the fcbpage becomes the ? active ? page, the data is not transferred between the fcbpages and should be considered uninitialized. this is an important consideration for picocode written to handle egress multicast frames. 7.4.4.1 enqueue coprocessor address map fcbpage format the fcbpage format varies based on dispatch parameters (ingress or egress frames) and access methods. the fcbpage can be accessed as a defined field, by word, or by quadword. the set of defined fields varies according to the ingress or egress dispatch parameter. fields are mapped into locations of the shared memory pool. fields not defined as a multiple of eight bits are stored in the least significant bits (right-justi- fied) of the byte location. table 7-36. enqueue coprocessor address map name register address size access description disp_label x ? 00 ? 1r/w indicates whether a label was dispatched to the completion unit for this frame. if the nolabel parameter is not passed during an enqi or enqe command, then this bit is used to determine if the enqueue will be done with or without a label. 0 indicates that a label was not passed to the completion unit for this frame. 1 indicates that a label was passed to the completion unit for this frame. activefcbpage1 x ? fc ? 384 r/w active fcb page for the current thread. this page is initialized at dispatch. inactivefcbpage1 x ? fd ? 384 r/w inactive fcb page for the current thread. this page is not ini- tialized at dispatch. this array should never be written. fcbpage2 x ? fe ? 384 r/w alternate fcbpage
ibm powernp np4gs3 network processor preliminary embedded processor complex page 244 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 figure 7-16. ingress fcbpage format 01234567 sp (6) abort (1) gt (1) fcinfo (4) wbc (15) fcba (12) 8 9 10 11 12 13 14 15 currentbuffer (11) not used (8) not used (8) tdmu (2) l3stk/idsu (8/4) pib (6) tos (8) 16 17 18 19 20 21 22 23 tb (16) iucnmc (1) priority_sf (2) lid / mid (21 / 17) 24 25 26 27 28 29 30 31 vlanhdr (16) ins_ovlvlan (2) fhf (4) fhe (32) 32 33 34 35 36 37 38 39 not used (8) not used (8) not used (8) not used (8) not used (8) not used (8) not used (8) not used (8) 40 41 42 43 44 45 46 47 countercontrol (14) counterdata (16) counterblockindex (20) table 7-37. ingress fcbpage description (page 1 of 3) field fcb page offset initialized by dispatch unit enqueue info size (bits) description sp x ? 00 ? y n 6 source port is the port identifier where this frame was received. abort x ? 01 ? yn1 aborted frame indicates that the frame had been marked abort at the time of dispatch. gt x ? 02 ? yn1 guided traffic indicator. this bit must be set to ? 1 ? when the frame is guided traffic. fcinfo x ? 03 ? yy4 flow control color and frame drop information. see table 7-83: flow control information values on page 279. setting this field to x ? f ? disables flow control. flow control must be disabled when enqueing to the gdq, gfq, or the discard queue. wbc x ? 04 ? yn15 working byte count. this is the number of bytes available in the ingress data store for this frame at the time it was dispatched. fcba x ? 06 ? y y 11 frame control block address for the frame dispatched currentbuffer x ? 08 ? y y 11 ingress data store buffer address of the frame dispatched tdmu x ? 0c ? ny2 egress target dmu. encode is tdmu(1:0) dmu 00 a 01 b 10 c 11 d
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 245 of 554 l3stk/idsu x ? 0d ? ny8/4 l3stk (8-bit field). when iucnmc = 0 (frame is multicast), this field contains the value of the dll termination offset. the dll ter- mination offset is defined as the number of bytes starting at the beginning of the frame to the position one byte beyond the end of the data link layer. this value is based upon the encapsulation type. typically, this is the same as the start of the layer 3 protocol header, an exception would be for mpls. idsu (4-bit field). data store unit field in the frame header. indi- cates where the frame should be stored when entering the egress side. when iucnmc = 1 (frame is unicast). when set to 0, hard- ware determines the value of the data store unit field by using the value of the tdmu field and contents of the ingress tdmu data storage map register (i_tdmu_dsu) (see 13.12 ingress target dmu data storage map register (i_tdmu_dsu) on page 459). pib x ? 0e ? ny6 point in buffer. prior to enqueue, indicates location of the first byte of the frame to be sent across the switch interface tos x ? 0f ? yy8 if the frame is an ip frame these bits will be the tos field in the ip header, otherwise they will be initialized to 0 ? s. for differentiated services, the following subfields are defined: 7:5 af class. bits are used by the flow control hardware when addressing the transmit probability table. 4:2 drop precedence 1:0 cu (currently unused). tb x ? 10 ? ny16 target blade vector or target blade address. when in 16-blade mode these 16 bits are used as a target blade vector for multicast frames. in all other modes and for uc frames this is a target blade address. as a target blade vector, bit 15 corresponds to target blade address 0, that is, tb(0:15) in 64-blade mode, valid unicast target blade addresses are 0 through 63, multicast addresses are 512 to 65535. iucnmc x ? 12 ? ny1 unicast/multicast indicator 0 multicast frame 1 unicast frame priority_sf x ? 13 ? ny2 priority and special field indicators defined as: bit description 0 special field. indicates enqueues to the discard queue, gdq, or gcq. this is normally set by hardware when the qclass is used in the enqi command. 1 priority. ingress scheduler priority indicates the user pri- ority assigned to the frame. high priority is indicated by a value of 0. lid/mid x ? 14 ? ny21/17 lookup id (21-bit field). ingress picocode uses to pass information to the egress picocode. used when iucnmc = 1 (frame is uni- cast). multicast id (17-bit field). ingress picocode uses to pass informa- tion to the egress picocode. used when iucnmc = 0 (frame is multicast). vlanhdr x ? 18 ? ny16 vlan header. this is the tag control information field defined in the ieee 802.3 standard. table 7-37. ingress fcbpage description (page 2 of 3) field fcb page offset initialized by dispatch unit enqueue info size (bits) description
ibm powernp np4gs3 network processor preliminary embedded processor complex page 246 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 ins_ovlvlan x ? 1a ? ny2 insert or overlay vlan. indicates if the vlan header provided in the vlanhdr scalar register is inserted or overlays an existing vlan tag. bit description 0overlayvlan 1 insert vlan fhf x ? 1b ? ny4 frame header format. 4-bit field used by hardware and set up by picocode. hardware classifier uses this value on the egress side to determine the starting instruction address for the frame. fhe x ? 1c ? ny32 frame header extension. a 4-byte field whose contents are defined by picocode. countercontrol x ? 28 ? ny14 passed to flow control for delayed counter manager functions. bits description 13 when set to 1 enables counter operation for this enqueue. counter updates on an enqueue work only when the target is a target blade. counter updates do not occur when enqueueing to the gdq, gfq, or the discard queue. 12 add/increment 11:8 counter number 7:0 counter definition table index counterdata x ? 2a ? ny16 data passed to ingress flow control for delayed counter manager add functions counterblockindex x ? 2c ? ny20 block index passed to ingress flow control for delayed counter manager functions figure 7-17. egress fcbpage format 01234567 sb (6) eucmc (3) dsusel (1) fcinfo (4) bci (20) 8 9 10 11 12 13 14 15 currtwin (20) type (3) dsu (2) qhd (1) ow (4) 16 17 18 19 20 21 22 23 qid (20) etypeact (3) etypevalue (16) datafirsttwin (19) 24 25 26 27 28 29 30 31 saptr (10) da47_32 (16) da31_0 (32) 32 33 34 35 36 37 38 39 sainsovl (2) vlan_mpls _hwa crcaction (2) dllstake (6) ttlassist (2) not used (8) not used (8) not used (8) 40 41 42 43 44 45 46 47 countercontrol (14) counterdata (16) counterblockindex (20) table 7-37. ingress fcbpage description (page 3 of 3) field fcb page offset initialized by dispatch unit enqueue info size (bits) description
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 247 of 554 table 7-38. egress fcbpage description (page 1 of 4) field fcb page offset initialized by dispatch unit size (bits) description sb ? 00 ? y 6 source blade eucmc ? 01 ? y3 egress unicast/multicast bits 000 unicast 001 first multicast 010 middle multicast 011 last multicast 100 unicast static frame enqueue 101 first multicast static frame 110 middle multicast static frame 111 last multicast static frame dsusel ? 02 ? y1 indicates which dsu is in use by the dispatch unit and read more instruc- tions. 0ds0 1ds1 fcinfo ? 03 ? y4 flow control color information pulled from the frame header by the hardware classifier. see table 7-83: flow control information values on page 279. bci ? 04 ? y20 byte count indicator passed from a queue (gr0, gr1, gb0, gb1, gpq, gtq, gfq) to the fcbpage. indicates starting byte location within the first twin, number of data buffers, and ending byte location within the last twin. bit description 19:14 starting byte 13:6 number of data buffers (weight) 5:0 ending byte byte numbering within the data buffers starts at 0 and goes to 63. currtwin ? 08 ? y 20 at dispatch, indicates the first twin address of the frame type ? 0c ? y3 type indicates frame type and data store used. type (2:1) 00 frame 01 reserved 10 reserved 11 abort type (0) for gtq, gpq 0dsu0 1dsu1 type(0) for gr0, gb0 0dsu0 1 bothdsu0and1 type(0) gr1, gb1 0dsu1 1 both dsu0 and 1 dsu ? 0d ? y2 indicates in which dsu(s) the data for this frame is stored. value is dsu(1:0). value description 00 reserved 01 stored in dsu 0 10 stored in dsu 1 11 stored in both dsu 0 and 1
ibm powernp np4gs3 network processor preliminary embedded processor complex page 248 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 qhd ? 0e ? n1 twin header qualifier. indicates type of twin pointed to by currtwin. used for egress frame alteration. 0 the twin pointed to by the currtwin is a data twin 1 the twin pointed to by the currtwin is a header twin ow ? 0f ? n4 orphan twin weight. number of twins orphaned by frame alteration actions. when this field is greater than 0, the datafirsttwin scalar must be loaded with the location of the first data twin. when this field is 0, the etypeact and etypevalue scalar registers can be loaded for insert and overlay ethertype frame alterations. qid ? 10 ? n20 queue identifier. the format of the qid is determined by qid(19:18) as fol- lows: qid(19:18) 00 and scheduler is active, the queue is a flow qcb qid(10:0) flow qcbaddress 00 and scheduler disabled, the queue is a indicates the target port where: qid (6) priority qid (5:0) target port id 11 g queue identifier, where qid(17:0) indicates the queue as: qid (17:0) queue 000 gr0 001 gr1 010 gb0 011 gb1 100 gfq 101 gtq 110 gpq 111 discard datafirsttwin ? 14 ? n 19 address of frame ? s first data twin. valid when ow is not 0. etypeact ? 15 ? n3 the field is defined as bits(2:0) definition 000 etype value is invalid 001 etype value is valid and is used when sa/dainsert/overlay hard- ware assisted frame alteration is active to insert/overlay the etype field of the frame being processed. 010-111 reserved etypevalue ? 16 ? n 16 ethertype value used in insert / overlay egress frame alteration saptr ? 18 ? n10 source address pointer for egress frame alteration. indicates the sa array address used by the e-pmm to locate the source address for egress frame alterations. valid ranges are 0 - 63. da47_32 ? 1a ? n16 destination address bits (47:32) used for frame alteration sa/da insert or overlay actions. da31_0 ? 1c ? n32 destination address bits (31:0) used for frame alteration sa/da insert or overlay actions. table 7-38. egress fcbpage description (page 2 of 4) field fcb page offset initialized by dispatch unit size (bits) description
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 249 of 554 sainsovl ? 20 ? n2 egress frame alteration controls for overlay of sa, da, and ethertype (when etypeact is set to ? 001 ? ) bit description 1 indicates insert (field value 10) 0 indicates overlay (field value 01) both bits may not be set to 1. (11 = invalid and 00 = no action) vlan_mpls_hwa ? 21 ? n2 hardware assist for deleting vlan tags, deleting mpls labels and per- forming mpls label swap. the field is defined as follows: 00 no action 01 mpls label delete. 10 vlan tag delete 11 mpls label swap (np4gs3b (r2.0) only) the location of the vlan tag is fixed at offset 12. the location of the mpls label is determined by the value of the dll stake field. the mpls label swap function modifies the label stack entry (4 bytes) as follows: - the 20-bit label field is replaced by the contents of the da(47:28) field - the stack entry exp and s fields are unchanged. - the stack entry ttl field is decremented. crcaction ? 22 ? n2 egress frame alteration controls for modifying the crc of an ethernet frame. value description 00 no operation 01 reserved 10 append crc 11 overlay crc dllstake ? 23 ? n6 the value of the dll termination offset. the dll termination offset is defined as the number of bytes starting at the beginning of the frame to the position one byte beyond the end of the data link layer. this value is based upon the encapsulation type. typically, this is the same as the start of the layer 3 protocol header, an exception would be for mpls. ttlassist ? 24 ? n2 egress frame alteration controls for modifying the time to live field in ip headers or the next hop field in ipx headers. value is ttlassist(1:0) below: value description 00 disabled 01 ipv4, decrement ttl 10 ipx, increment hop count 11 reserved table 7-38. egress fcbpage description (page 3 of 4) field fcb page offset initialized by dispatch unit size (bits) description
ibm powernp np4gs3 network processor preliminary embedded processor complex page 250 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 fcbpage initialization during dispatch during a dispatch to a thread, the hardware classifier and the dispatch unit provide information about the frame being dispatched that is used to initialize some of the fields within the fcbpage. values initialized at thetimeofdispatchareindicatedin table 7-37 and table 7-38 . values that are not indicated as initialized at dispatch are initialized to ? 0 ? for the ? active ? fcbpage. 7.4.4.2 enqueue coprocessor commands the following commands are supported by the enqueue coprocessor: countercontrol ? 28 ? n14 passed to flow control for delayed counter manager functions bits description 13 when set to 1 enables counter operation for this enqueue. counter updates on an enqueue work only when the target is a target port or flow queue. counter updates do not occur when enqueing to the grx, gbx, gfq, gpq, gtq or e-gdq. counter updates are sup- ported for the discard port (portid = 41) and the wrap ports (portid = 40, 42) 12 add/increment 11:8 counter number 7:0 counter definition table index counterdata ? 2a ? n16 data passed to egress flow control for delayed counter manager add func- tions. counterblockindex ? 2c ? n20 block index passed to egress flow control for delayed counter manager functions command opcode description enqe 0 enqueue egress. enqueues to the egress eds via the completion unit. for more information, see enqe (enqueue egress) command on page 251. enqi 1 enqueue ingress. enqueues to the ingress eds via the completion unit. for more information, see enqi (enqueue ingress) command on page 253. enqclr 2 enqueue clear. clears (sets all fields to zero in) the specified fcbpage. for more information, see enqclr (enqueue clear) command on page 255. release_label (np4gs3b (r2.0)) 3 release label. releases the label in the completion unit for this frame. for more information, see release_label command on page 256. table 7-38. egress fcbpage description (page 4 of 4) field fcb page offset initialized by dispatch unit size (bits) description
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 251 of 554 enqe (enqueue egress) command enqe enqueues frames to the egress target queues. the qid field in the fcbpage selects the egress target queue. table 7-40 shows the coding of the qid for this selection. enqe takes a queueclass as a parameter, and some values of queueclass automatically set some bits in the qid field to a predefined value. when a frame is enqueued, the parameters from the fcbpage parameters are extracted and passed to the egress target queue. table 7-39. enqe target queues enqe command enqueues a frame in one of these egress target queues target queue description note target port/flow queues if the scheduler is disabled, enqueued frames are transmitted on target ports 0-39, and logical ports 40-42. if the scheduler is enabled, enqueued frames are enqueued into the flow queues. gr0 enqueued frames are destined for any gdh (or the gfh when it is enabled for data frame processing). a unicast frame must be stored in ds0 when queue gr0 is used. a multicast frame must always be stored in both ds0 and ds1. 1 gr1 enqueued frames are destined for any gdh (or the gfh when it is enabled for data frame processing). a unicast frame must be stored in ds1 when queue gr1 is used. a multicast frame must always be stored in both ds0 and ds1. 1 gb0 same as gr0, but treated by the dispatch unit as lower priority 1 gb1 same as gr1, but treated by the dispatch unit as lower priority 1 gfq enqueued frames in this queue are destined for the gfh 1 gtq enqueued frames in this queue are destined for the gth 1 gpq enqueued frames in the queue are destined for the embedded powerpc 1 discard queue (e-gdq) enqueued frames in this queue are discarded, which involves freeing e-ds space (twins). frames are only discarded when the mcca (multicast counter address) enqueue parameter equals zero, or when the multicast counter itself has the value 1. 1. when enqueuing to the gr0, gr1, gb0, gb1, gfq, gtq, or the gpq it is recommended to check the depth of the queue. a dead lock can occur if an attempt is made to enqueue to a full queue. when the queue is full, the enqueue should be re-directed to the discard queue, an action performed by code, and the event reported either via a count or guided traffic. table 7-40. egress target queue selection coding scheduler enabled? qid(19-18) target queue class queue address priority target queue yes 00 flow queue qid(10..0) - flow queue no 00 target port queue qid(5..0) (tpqid) qid(6) (tppri) 0-39 40 41 42 ports wrap to ingress gfq discard (dpq) wrap to ingress gdq - 11 gqueue qid(2..0) (gqid) - 000 gr0 001 gr1 010 gb0 011 gb1 100 gfq 101 gtq 110 gpq 111 discard (e-gdq)
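the following sketch decodes a 20-bit qid according to table 7-40. it assumes the caller already knows whether the egress scheduler is enabled (that comes from configuration, not from the qid itself); the function name and the printed wording are illustrative only.

#include <stdint.h>
#include <stdio.h>

/* g-queue encodings for qid(2..0) when qid(19..18) = '11' (table 7-40) */
static const char *gqueue_name[8] = {
    "gr0", "gr1", "gb0", "gb1", "gfq", "gtq", "gpq", "discard (e-gdq)"
};

/* print the egress target selected by a 20-bit qid */
static void decode_egress_qid(uint32_t qid, int scheduler_enabled)
{
    unsigned top = (qid >> 18) & 0x3;

    if (top == 0x3) {
        printf("g-queue: %s\n", gqueue_name[qid & 0x7]);
    } else if (top == 0x0 && scheduler_enabled) {
        printf("flow queue, flow qcb address %u\n", qid & 0x7ff);   /* qid(10..0) */
    } else if (top == 0x0) {
        printf("target port %u, priority %u\n",
               qid & 0x3f,            /* qid(5..0): ports 0-39, 40-42 logical */
               (qid >> 6) & 0x1);     /* qid(6)                                */
    } else {
        printf("qid(19..18) = %u: not defined in table 7-40\n", top);
    }
}

int main(void)
{
    decode_egress_qid(0xC0004, 0);  /* '11' with gqid 100 -> gfq                   */
    decode_egress_qid(41, 0);       /* scheduler disabled: target port 41 (discard) */
    return 0;
}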
ibm powernp np4gs3 network processor preliminary embedded processor complex page 252 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 enqe takes three parameters: the queueclass, a nolabel flag, and the fcbpage. according to the queue class parameter, bits 19, 18, and 5..0 in the qid field of the fcbpage are changed by the coprocessor according to table 7-44 . like enqi, enqe does not modify the fcbpage. table 7-41. egress target queue parameters queue type queue select enqueue parameters target port/flow queue qid qid, bci, qhd, ow, datafirsttwin, sb, currtwin, mcca, mpls_vlandel, sainsovl, crcaction, l3stake, ttlassist, dsu, saptr, da, fcinfo, countercontrol, counterdata, counterblockindex gr0, gr1, gb0, gb1, gfq, gtq, gpq qid currtwin, type, bci, mcca, countercontrol, counterdata, counterblockindex discard (e-gdq) qid currtwin, type (see table 7-42 ), bci, mcca table 7-42. type field for discard queue type definition 001 discard ds0 010 discard ds1 others reserved table 7-43. enqe command input operand source name size direct indirect description queueclass 5 imm16(4..0) gpr(4..0) egress queue class as defined in table 7-44 . nolabel 1 imm16(5) imm12(5) 0 the cu will use a label if one was dispatched with this frame. this is determined by the status of the disp_label register. 1 the completion unit will not use a label for this enqueue. this enqueue is directly executed and is not part of the frame sequence maintenance. fcbpageid 2 imm16(7..6) imm12(7..6) 00 active fcbpage is enqueued. 10 fcbpage 2 is enqueued.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 253 of 554 enqi (enqueue ingress) command enqi enqueues frames to the ingress target queues. ingress target queues are selected by means of three fields that are part of the fcbpage: iucnmc, priority_sf, and tdmu. table 7-46 shows the coding of this selection. table 7-44. egress queue class definitions queueclass qid19 qid18 qid2 qid1 qid0 target queue notes 0 ----- reserved 1 0 0 - - - port/flow queue 2 0 0 qid(5..0) = 40 wrap gfq queue 4 3 0 0 qid(5..0) = 41 discard queue (dpq) 4 0 0 qid(5..0) = 42 wrap frame queue 5 1 1 0 0 s grx queue 1 6 1 1 0 1 s gbx queue 1 7 11100 gfq 8 11101 gtq 9 11110 gpq 10 1 1 1 1 1 discard queue (gdq) 3 15 - - - - - any queue 2 1. the enqe instruction may automatically select the appropriate data store or queue, depending on the dsusel bit. 2. queue class 15 does not modify any qid bits and allows picocode full control of the target queue. 3. the type field is modified: bit 2 is set to 0 and the dsu bits are copied to bits 0 and 1 of the type field. 4. when using queue class 2, 3, or 4 and the scheduler is enabled, it is the responsibility of the picocode to insure that qcb 40, 41, and 42 are initialized for the wrap grq (40), the discard port (41), and wrap data queue (42). note: ? - ? - the field is not modified by the enqueue instruction s - the value of the dsusel bit in the fcbpage table 7-45. enqi target queues target queue description ingress multicast queue priorities 0/1 enqueued frames are treated as multicast frames. guided frames must be enqueued in a multi- cast queue, even when their destination is a single port. ingress tp queue priorities 0/1 enqueued frames are treated as unicast frames. there are a total of 512 tdmu queues: 256 high priority (priority 0) and 256 low priority. ingress discard queue priorities 0/1 enqueued frames are discarded, which involves freeing ingress buffer space (bcbs and fcbs). discarding frames on ingress consumes ingress scheduler slots, i.e., frames are discarded at 58 bytes per slot. the fcbpage's target blade field (tb) must be set to 0. ingress gdq enqueued frames are destined for any gdh (or gfh when enabled for data frame processing). ingress gfq enqueued frames are destined for the gfh.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 254 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 when a frame is enqueued the parameters from the fcbpage are extracted and passed to the ingress target queue. table 7-47 shows the parameters that are passed to the ingress eds. enqi takes three parameters: the queueclass, a nolabel flag, and a fcbpage. according to the queue- class parameter, the priority_sf and tdmu fields in the fcbpage are modified by the enqueue coprocessor according to table 7-49 when passed to the ingress eds. for example, to enqueue a unicast frame for trans- mission, the picocode prepares the fcbpage, including the priority and tp fields and invokes enqi with queueclass set to 5. table 7-46. ingress target queue selection coding iucnmc priority sf tdmu target queue 0 0 0 - ingress multicast queue - priority 0 0 1 0 - ingress multicast queue - priority 1 1 0 0 0 - 3 ingress tp queue - priority 0 1 1 0 0 - 3 ingress tp queue - priority 1 1010 ingressdiscardqueue-priority0 1110 ingressdiscardqueue-priority1 1 x 1 2 ingress gdq 1 x 1 3 ingress gfq table 7-47. ingress target queue fcbpage parameters frame type parameters uc ethernet fcba, countercontrol, counterdata, counterblockindex, tb, tdmu, fcinfo, priority_sf, iucnmc, lid, idsu, fhf, fhe, vlanhdr, pib, ins_ovlvlan mc ethernet fcba, countercontrol, counterdata, counterblockindex, tb, priority_sf, iucnmc, fcinfo, mid, l3stk, fhf, fhe, vlanhdr, pib, ins_ovlvlan table 7-48. enqi command input operand source name size direct indirect description queueclass 5 imm16(4..0) gpr(4..0) table 7-49: ingress-queue class definition on page 255 nolabel 1 imm16(5) imm12(5) 0 the cu will use a label if one was dispatched with this frame. this is determined by the status of the disp_label register. 1 the completion unit will not use a label for this enqueue. this enqueue is directly executed and is not part of the frame sequence maintenance. fcbpageid 2 imm16(7..6) imm12(7..6) 00 active fcbpage 1 is enqueued. 10 fcbpage 2 is enqueued.
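as a worked illustration of table 7-46, the sketch below maps the iucnmc, priority_sf, and tdmu fcbpage fields to an ingress target queue. it is a reading aid only; the selection itself is performed by hardware, and the function name is hypothetical.

#include <stdio.h>

/* select an ingress target queue from the fcbpage fields (table 7-46).
 * priority_sf bit 1 is the priority, bit 0 is the special-field (sf) bit. */
static void ingress_target_queue(unsigned iucnmc, unsigned priority,
                                 unsigned sf, unsigned tdmu)
{
    if (iucnmc == 0 && sf == 0)
        printf("ingress multicast queue - priority %u\n", priority);
    else if (iucnmc == 1 && sf == 0)
        printf("ingress tp queue - priority %u, tdmu %u\n", priority, tdmu & 0x3);
    else if (iucnmc == 1 && sf == 1 && tdmu == 0)
        printf("ingress discard queue - priority %u\n", priority);
    else if (iucnmc == 1 && sf == 1 && tdmu == 2)
        printf("ingress gdq\n");
    else if (iucnmc == 1 && sf == 1 && tdmu == 3)
        printf("ingress gfq\n");
    else
        printf("combination not defined in table 7-46\n");
}

int main(void)
{
    ingress_target_queue(1, 0, 0, 2);  /* unicast, priority 0 -> tp queue, tdmu 2 */
    ingress_target_queue(1, 0, 1, 3);  /* sf set, tdmu 3 -> ingress gfq            */
    return 0;
}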
enqi does not modify the fcbpage. for example, if the queueclass parameter is set to txq, the sf field is set to '0'. this means that the sf field received by the completion unit and passed to the ingress eds is forced to '0'; the sf field in the fcbpage itself is not modified.
table 7-49. ingress-queue class definition
queue class | symbolic name | iucnmc | priority | sf | tdmu | target queue
0 | dq | 1 | - | 1 | 0 | discard queue. picocode must set the priority field. the tb field must be set to 0 by the picocode. the fcinfo field must be set to x'f'.
2 | i-gdq | 1 | 1 | 1 | 2 | gdh queue. the fcinfo field must be set to x'f'.
4 | gfq | 1 | 1 | 1 | 3 | gfh queue. the fcinfo field must be set to x'f'.
5 | txq | - | - | 0 | - | multicast queue or tdmu qcb queues (transmission queues). picocode must set the iucnmc and tp fields.
7 | anyq | - | - | - | - | any queue. picocode must set the iucnmc, priority, sf, and tp fields.
note: a '-' means the fcbpage field is not modified by the enqi command when passed to the ingress eds.
enqclr (enqueue clear) command enqclr takes one parameter, the fcbpage, and fills the entire fcbpage register with zeros.
table 7-50. enqclr command input
operand: fcbpageid | size: 2 | source: imm16(7..6) (direct), imm12(7..6) (indirect) | description: indicates which fcbpage is to be cleared by the command. 00 - active fcbpage is cleared. 10 - fcbpage 2 is cleared.
table 7-51. enqclr output
operand: fcbpage | size: 384 | source: array | description: the fcbpage array indicated by the input fcbpageid is reset to all '0's.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 256 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 release_label command this command (available in np4gs3b (r2.0)) releases the label in the completion unit for this frame. this command has no inputs. 7.4.5 checksum coprocessor the checksum coprocessor generates checksums using the algorithm found in the ietf network working group rfc 1071 ? computing the internet checksum ? (available at http://www.ietf.org ). as such, it performs its checksum operation on halfword data with a halfword checksum result. 7.4.5.1 checksum coprocessor address map table 7-52. release_label output operand source name size direct indirect description disp_label 1 r the disp_label register is reset on the execution of the release label command. table 7-53. checksum coprocessor address map name register (array) number access size (bits) description chksum_stat x ? 00 ? r2 status of checksum coprocessor bits description 1 insufficient indicator (if set to 1) 0 bad checksum (if set to 1) chksum_acc x ? 01 ? r/w 16 when writing this register, the value is a checksum. when reading this register, the value is a header checksum (the one ? s comple- ment of a checksum). chksum_stake x ? 02 ? r/w 10 pointer into data store coprocessor arrays where checksum is to be performed. bits description 9:8 data store coprocessor array number 7:0 byte offset into the array chksum_length x ? 03 ? r/w 8 working length remaining in the checksum calculation
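a small illustrative helper for the chksum_stake layout in table 7-53: the data store coprocessor array number occupies bits 9:8 and the byte offset bits 7:0. the function name is hypothetical.

#include <stdint.h>
#include <stdio.h>

/* pack a data store coprocessor array number (0..3) and a byte offset
 * (0..255) into the 10-bit chksum_stake format (table 7-53)            */
static uint16_t pack_chksum_stake(unsigned array_no, unsigned byte_offset)
{
    return (uint16_t)(((array_no & 0x3) << 8) | (byte_offset & 0xff));
}

int main(void)
{
    printf("stake = 0x%03x\n", pack_chksum_stake(1, 0x20)); /* array 1, offset 0x20 */
    return 0;
}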
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 257 of 554 7.4.5.2 checksum coprocessor commands data for the checksum coprocessor must be in one of the data store coprocessor ? s arrays. the commands to the checksum coprocessor include: when an ip is indicated, the starting location (i.e., stake) for the layer 3 header is passed. the hardware determines the length of the ip header from the header length field and loads this value into the length scalar register. when generating the checksum, a value of zero is substituted for the halfword that contains the current checksum. when cell header skip is indicated, the cell header in the egress frame is skipped in checksum operations. see the datapool for egress frames on page 227 for more details of cell header skip. command opcode description gengen 0 generate checksum. generates a checksum over a data block with a specified length. options of this command include initiating a new checksum operation or continuing a checksum where a pre- vious checksum has left off. for more information, see gengen/gengenx commands on page 258. gengenx 4 generate checksum with cell header skip. for more information, see gengen/gengenx commands on page 258. genip 1 generate ip checksum. for more information, see genip/genipx commands on page 259. genipx 5 generate ip checksum with cell header skip. for more information, see genip/genipx commands on page 259. chkgen 2 check checksum. checks a checksum over a data block with a specified length. options of this command include initiating a new checksum operation or continuing a checksum where a previous checksum has left off. for more information, see chkgen/chkgenx commands on page 259. chkgenx 6 check checksum with cell header skip. for more information, see chkgen/chkgenx commands on page 259. chkip 3 check ip checksum. for more information, see chkip/chkipx commands on page 260. chkipx 7 check ip checksum with cell header skip. for more information, see chkip/chkipx commands on page 260.
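because the coprocessor computes the internet checksum of rfc 1071 over halfword data, a short software reference model can clarify what the gengen/chkgen family of commands produces. the sketch below is a generic illustration of the rfc 1071 algorithm, not a description of the hardware datapath; big-endian halfword ordering and zero-padding of an odd trailing byte are assumptions made for the example.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* rfc 1071 internet checksum over big-endian halfwords.
 * returns the one's-complement sum; the header checksum carried in a packet
 * is the one's complement of this value (a read of chksum_acc likewise
 * returns the one's complement form).                                      */
static uint16_t rfc1071_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len)                      /* odd trailing byte padded with zero */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)             /* fold carries back into 16 bits     */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)sum;
}

int main(void)
{
    /* minimal ipv4 header with its checksum field (bytes 10-11) zeroed */
    uint8_t hdr[20] = { 0x45, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x00,
                        0x40, 0x11, 0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01,
                        0xc0, 0xa8, 0x00, 0x02 };
    uint16_t cksum = (uint16_t)~rfc1071_sum(hdr, sizeof hdr);
    printf("ip header checksum = 0x%04x\n", cksum);
    return 0;
}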
ibm powernp np4gs3 network processor preliminary embedded processor complex page 258 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 gengen/gengenx commands table 7-54. gengen/gengenx command inputs operand source name size direct indirect description load new stake 1 imm(6) imm(6) indicates: 1 stake value is passed in the command. 0 stake value is in chksum_stake. clear accumulation 1 imm(5) imm(5) chksum_acc contains the seed for the checksum operation. the value in this register indicates whether to use or clear the data in chksum_acc. 0usedata 1 clear chksum_acc stake argument 10 imm(1:0,15:8) imm(1:0) gpr(23:16) the new stake value to be used for the checksum operation if the load new stake argument is ? 1 ? . the stake argument is comprised of two parts. the first (imm1:0) is the data store coprocessor array number. the second(imm(15:8) or gpr(23:16)) is the byte offset within the array. chksum_stake 10 r the stake value to be used for the checksum operation if the load new stake argument is ? 0 ? . chksum_acc 16 r contains the seed for the next checksum operation if the ? clear accumulation ? argument is a ? 0 ? . (otherwise there is no data in chksum_acc). chksum_length 8 r the number of halfwords to read when calculating checksum table 7-55. gengen/gengenx/genip/genipx command outputs operand source name size direct indirect description return code 1 cpei signal 0 ko, operation failed because there was not enough data in the datapool 1 ok, operation completed successfully. chksum_stat 2 r status of checksum coprocessor bits description 1 insufficient data in datapool 0 bad checksum chksum_stake 10 r stake register. points to the halfword of data following the last halfword used in the checksum command. chksum_acc 16 r the seed for the next checksum operation. contains the resulting checksum if the return code is ok.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 259 of 554 genip/genipx commands for genip/genipx command outputs, see table 7-55: gengen/gengenx/genip/genipx command outputs on page 258. chkgen/chkgenx commands table 7-56. genip/genipx command inputs operand source name size direct indirect description load new stake 1 imm(6) imm(6) indicates: 1 stake value is passed in the command. 0 stake value is in chksum_stake. clear accumulation 1 imm(5) imm(5) chksum_acc contains the seed for the checksum operation. the value in this register indicates whether or not to use or clear the data in chksum_acc. 0usedata 1 clear chksum_acc clear ip count 1 imm(4) imm(4) ip count clear control. 0 continue with current count in the checksum length register 1 clear counter and load with value in ip length field. stake argument 10 imm(1:0,15:8) imm(1:0) gpr(23:16) the new stake value to be used for the checksum operation if the load new stake argument is ? 1 ? . the stake argument is comprised of two parts. the first (imm1:0) is the data store coprocessor array number. the second(imm(15:8) or gpr(23:16)) is the byte offset within the array. chksum_stake 10 r the stake value to be used for the checksum operation if the load new stake argument is ? 0 ? . chksum_acc 16 r contains the seed for the next checksum operation if the ? clear accumulation ? argument is a ? 0 ? (otherwise there is no data in chksum_acc). table 7-57. chkgen/chkgenx command inputs operand source name size direct indirect description load new stake 1 imm(6) imm(6) indicates: 1 stake value is passed in the command. 0 stake value is in chksum_stake. stake argument 10 imm(1:0,15:8) imm(1:0) gpr(23:16) the new stake value to be used for the checksum operation if the load new stake argument is ? 1 ? . the stake argument is comprised of two parts. the first (imm1:0) is the data store coprocessor array number. the second(imm(15:8) or gpr(23:16)) is the byte offset within the array. chksum_stake 10 r the stake value to be used for the checksum operation if the load new stake argument is ? 0 ? . chksum_acc 16 r this is the checksum to be verified. chksum_length 8 r the number of halfwords to read when calculating the check- sum.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 260 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 chkip/chkipx commands for chkip/chkipx command outputs, see table 7-58: chkgen/chkgenx/chkip/chkipx command outputs on page 260. results of the commands are found in chksum_acc, chksum_stake, and the 1-bit return code to the clp. chksum_acc contains the result of the checksum calculation. chksum_stake contains the byte location following the last halfword included in the checksum. the return code indicates the operation completed successfully or was verified. the status register may need to be read to determine the status on a bad return code. table 7-58. chkgen/chkgenx/chkip/chkipx command outputs operand source name size direct indirect description return code 1 cpei signal 0 ko, operation failed. check status register for reason. 1 ok, operation completed successfully. chksum_stat 2 r status of checksum coprocessor bits description 1 insufficient data in datapool 0 bad checksum chksum_stake 10 r stake register. points to the halfword of data following the last halfword used in the checksum command. chksum_acc 16 r the seed for the next checksum operation. if equal to 0 the checksum was correct. table 7-59. chkip/chkipx command inputs operand source name size direct indirect description load new stake 1 imm(6) imm(6) indicates: 1 stake value is passed in the command. 0 stake value is in chksum_stake. clear ip count 1 imm(4) imm(4) ip count clear control. 0 continue with current count in the checksum length register. 1 clear counter and load with value in ip length field. stake argument 10 imm(1:0,15:8) imm(1:0) gpr(23:16) the new stake value to be used for the checksum operation if the load new stake argument is ? 1 ? . the stake argument is comprised of two parts. the first (imm1:0) is the data store coprocessor array number. the second(imm(15:8) or gpr(23:16)) is the byte offset within the array. chksum_stake 10 r the stake value to be used for the checksum operation if the load new stake argument is ? 0 ? . chksum_acc 16 r this is the checksum to be verified.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 261 of 554 7.4.6 string copy coprocessor the string copy coprocessor extends the dppu ? s capabilities to move blocks of data without tying up the clp. the data is moved within the shared memory pool and can start and end on any byte boundary within a defined array. 7.4.6.1 string copy coprocessor address map 7.4.6.2 string copy coprocessor commands strcopy (string copy) command strcopy moves multiple bytes of data between arrays in the shared memory pool. the clp passes the starting byte locations of the source and destination data blocks, and the number of bytes to move. table 7-60. string copy coprocessor address map name register number size (bits) access description strcpy_saddr x ? 00 ? 14 r/w the source address for the data to be copied: bit description 14 cell header skip access mode 13:10 coprocessor number 9:8 array number from coprocessor address maps 7:0 byte offset within the array strcpy_daddr x ? 01 ? 14 r/w the destination address for the data to be copied: bit description 14 cell header skip access mode 13:10 coprocessor number 9:8 array number from coprocessor address maps 7:0 byte offset within the array strcpy_bytecnt x ? 02 ? 8r the number of bytes remaining to be moved. this is a working reg- ister. once the coprocessor starts, this register will no longer be valid (it will show the number of bytes remaining). command opcode description strcopy 0 for more information, see strcopy (string copy) command on page 261 table 7-61. strcopy command input operand source name size direct indirect description strcpy_saddr 14 r source address. see table 7-60: string copy coprocessor address map on page 261 strcpy_daddr 14 r destination address. see table 7-60: string copy coprocessor address map on page 261 numbytes 8 imm(7:0) gpr(7:0) number of bytes to transfer
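the sketch below packs the strcpy_saddr / strcpy_daddr fields of table 7-60 into a single value: cell header skip in bit 14, coprocessor number in bits 13:10, array number in bits 9:8, and byte offset in bits 7:0. it illustrates only the layout; the helper name and the example values are hypothetical.

#include <stdint.h>
#include <stdio.h>

/* pack a string copy source/destination address (table 7-60) */
static uint16_t pack_strcpy_addr(unsigned cell_hdr_skip, unsigned coproc_no,
                                 unsigned array_no, unsigned byte_offset)
{
    return (uint16_t)(((cell_hdr_skip & 0x1) << 14)
                    | ((coproc_no     & 0xf) << 10)
                    | ((array_no      & 0x3) << 8)
                    |  (byte_offset   & 0xff));
}

int main(void)
{
    /* copy source: no cell header skip, coprocessor 1, array 2, offset 0x10 --
     * values chosen only for illustration                                      */
    printf("strcpy_saddr = 0x%04x\n", pack_strcpy_addr(0, 1, 2, 0x10));
    return 0;
}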
ibm powernp np4gs3 network processor preliminary embedded processor complex page 262 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.4.7 policy coprocessor the policy coprocessor provides an interface to the policy manager for threads. a thread requests an update to the ? color ? of a frame through this interface. frame color is part of the network processor's configurable flow control mechanism which determines what actions may be taken on the frame. a thread must wait until the policy manager, via the policy coprocessor, returns a result. 7.4.7.1 policy coprocessor address map 7.4.7.2 policy coprocessor commands polaccess (access policy manager) command polaccess requests that the policy manager accesses the policy control block for the flow that this frame is a member of. operands include the policy control block address, the length of the packet (usually the ip packet length), and the color currently assigned to the frame. the result returned is a new frame color. table 7-62. strcopy command output operand source name size direct indirect description strcpy_saddr 14 r source address. the offset field of the source address will be incremented by the number of bytes transferred. strcpy_daddr 14 r destination address. the offset field of the destination address will be incremented by the number of bytes transferred. numbytes 8 r this field is 0. table 7-63. policy coprocessor address map name register number size (bits) access description polcolor x ? 00 ? 2r/w both the color that is passed to the policy manager and the result color that is passed back from the policy manager polpktlen x ? 01 ? 16 r/w the packet length sent to the policy manager polcba x ? 02 ? 20 r a value of the policy control block address that was found in a leaf after the frame has been classified and a search was per- formed command opcode description polaccess 0 for more information, see polaccess (access policy manager) command on page 262
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 263 of 554 7.4.8 counter coprocessor the counter coprocessor provides an interface to the counter manager for threads. the counter copro- cessor has an eight-deep queue for holding counter access commands issued by any of the four threads running in each dppu. except for counter reads, the counter coprocessor will not stall the clp on synchro- nous commands unless the queue is full. this allows for one thread to have multiple counter commands outstanding simultaneously. for example one of the threads may have all the outstanding commands in the queue or each thread may have two each. normal coprocessor operation would only allow one outstanding command per thread. 7.4.8.1 counter coprocessor address map table 7-64. polaccess input operand source name size direct indirect description policy cba 20 gpr(19:0) polcba address polcolor 2 r policy color polpktlen 16 r policy packet length table 7-65. polaccess output operand source name size direct indirect description polcolor 2 r returned policy color table 7-66. counter coprocessor address map name register number size (bits) access description ctrdatalo x ? 00 ? 32 r/w counter data low. this register holds the least significant 32 bits of a counter on a read commands. on write or add commands the lower 16 bits serve as the data passed to the counter manager. (only bits 15..0 are write accessible). ctrdatahi x ? 01 ? 32 r counter data high. this register holds the most significant 32 bits of a counter on a read commands. ctrcontrol x ? 02 ? 12 r/w counter control. the bits have the following meaning: 11:8 = counter number. defines which counter in the counter-block should be updated. 7:0 = block definition index. an index in the counterdefmem.
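to illustrate the ctrcontrol layout in table 7-66, the helper below packs a counter number (bits 11:8) and a block definition index (bits 7:0) into the 12-bit register value. the names are hypothetical and the example carries no information beyond the bit positions given above.

#include <stdint.h>
#include <stdio.h>

/* build a 12-bit ctrcontrol value (table 7-66):
 * bits 11:8  counter number within the counter block
 * bits  7:0  block definition index into the counterdefmem */
static uint16_t pack_ctrcontrol(unsigned counter_no, unsigned def_index)
{
    return (uint16_t)(((counter_no & 0xf) << 8) | (def_index & 0xff));
}

int main(void)
{
    printf("ctrcontrol = 0x%03x\n", pack_ctrcontrol(3, 0x2a));
    return 0;
}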
ibm powernp np4gs3 network processor preliminary embedded processor complex page 264 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.4.8.2 counter coprocessor commands the counter coprocessor provides the following commands: ctrinc (counter increment) command ctrinc performs an access to the central counter manager to increment a counter. this command does not cause synchronous commands to stall if the counter coprocessor queue is not full. command opcode description ctrinc 0 counter increment. initiates a modify and increment command to the counter manager. for more information, see ctrinc (counter increment) command on page 264. ctradd 1 counter add. initiates a modify and add command to the counter manager. the coprocessor passes the counter manager a 16-bit value to add to the indicated counter. for more information, see ctradd (counter add) command on page 265. ctrrd 4 counter read. initiates a read and no clear command to the counter manager. the counter manager returns the counter value to the counter coprocessor and leaves the counter unmodified. for more information, see ctrrd (counter read) / ctrrdclr (counter read clear) command on page 265. ctrrdclr 5 counter read with counter clear. initiates a read and clear command to the counter manager. the counter manager returns the counter value to the counter coprocessor and resets the counter. for more information, see ctrrd (counter read) / ctrrdclr (counter read clear) command on page 265. ctrwr15_0 6 counter write bits 15:0 of a counter. initiates a write command to the counter manager. the coprocessor passes a 16-bit value to be loaded into bits 15:0 of the counter. uses lsb of counter data low register. for more information, see ctrwr15_0 (counter write 15:0) / ctrwr31_16 (counter write 31:16) command on page 266. ctrwr31_16 7 counter write bits 31:16 of a counter. initiates a write command to the counter manager. the coprocessor passes a 16-bit value to be loaded into bits 31:16 of the counter. uses lsb of counter data low register. for more information, see ctrwr15_0 (counter write 15:0) / ctrwr31_16 (counter write 31:16) command on page 266. table 7-67. ctrinc input operand source name size direct indirect description blockindex 20 gpr(19..0) defines a counterblock within an array of counterblocks ctrcontrol 12 r(11..0) counter control. the bits have the following meaning: 11:8 = counter number. defines which counter in the counter-block should be updated. 7:0 = block definition index. an index in the counterdefmem.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 265 of 554 ctradd (counter add) command ctradd performs an access to the central counter manager to add a 16-bit value to a counter. this command will not cause synchronous commands to stall if the counter coprocessor queue is not full. ctrrd (counter read) / ctrrdclr (counter read clear) command ctrrd / ctrrdclr performs an access to the central counter manager to read a counter. the ctrrdclr command also clears the counter after the read is performed. table 7-68. ctradd input operand source name size direct indirect description blockindex 20 gpr(19..0) defines a counterblock within an array of counterblocks ctrdatalo 16 r(15..0) the value to be added to the counter. ctrcontrol 12 r(11..0) counter control. the bits have the following meaning: 11:8 = counter number. defines which counter in the counter-block should be updated. 7:0 = block definition index. an index in the counterdefmem. table 7-69. ctrrd/ctrrdclr input operand source name size direct indirect description blockindex 20 gpr(19..0) defines a counterblock within an array of counterblocks ctrcontrol 12 r(11..0) counter control. the bits have the following meaning: 11:8 = counter number. defines which counter in the counter-block should be updated. 7:0 = block definition index. an index in the counterdefmem. table 7-70. ctrrd/ctrrdclr output operand source name size direct indirect description ctrdatalo 32 r for 32-bit counters this register contains the value of the counter once the read is performed. for 64-bit counters, this register contains the least significant 32 bits of the counter once the read is performed. ctrdatahi 32 r for 32-bit counters this register is not valid. for 64-bit counters, this register contains the most significant 32 bits of the counter once the read is performed.
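for a 64-bit counter, a ctrrd/ctrrdclr returns the least significant 32 bits in ctrdatalo and the most significant 32 bits in ctrdatahi (table 7-70). the sketch below shows the obvious way software might combine the two register reads; it is illustrative only and the function name is hypothetical.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* combine the two halves returned by a 64-bit counter read (table 7-70) */
static uint64_t counter_value64(uint32_t ctrdatahi, uint32_t ctrdatalo)
{
    return ((uint64_t)ctrdatahi << 32) | ctrdatalo;
}

int main(void)
{
    /* example register contents, chosen only for illustration */
    printf("counter = %" PRIu64 "\n", counter_value64(0x00000001, 0x00000010));
    return 0;
}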
ibm powernp np4gs3 network processor preliminary embedded processor complex page 266 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 ctrwr15_0 (counter write 15:0) / ctrwr31_16 (counter write 31:16) command ctrwr15_0/ctrwr31_16 performs an access to the central counter manager to write a 16-bit value to a counter. the ctrwr15_0 writes the 16-bit data value to bits 15 ..0 of the counter and the ctrwr31_16 writes the value to bits 31 ..16 of the counter. this command does not cause synchronous commands to stall if the counter coprocessor queue is not full. 7.4.9 semaphore coprocessor the semaphore manager and coprocessor are available in np4gs3b (r2.0). the semaphore coprocessor provides an interface for threads to the semaphore manager. the semaphore coprocessor supports one outstanding coprocessor command per thread, and indicates a busy status until the command has been serviced. 7.4.9.1 semaphore coprocessor commands the semaphore coprocessor provides the following commands: note: the busy signal will not be asserted if a lock no-op, unlock no-op, or reservation release no-op is issued. table 7-71. ctrwr15_0/ctrwr31_16 input operand source name size direct indirect description blockindex 20 gpr(19..0) defines a counterblock within an array of counterblocks ctrdatalo 16 r(15..0) the value to be written to the counter ctrcontrol 12 r(11..0) counter control. the bits have the following meaning: 11:8 = counter number. defines which counter in the counterblock should be updated. 7:0 = block definition index. an index in the counterdefmem. command opcode description semaphore lock 0 request to lock a semaphore. parameters include: thread semaphore number, orderid, semaphore value. this command is complete when the semaphore is locked. the minimum time to complete a lock is five cycles. this represents the amount of time busy signal is asserted: five cycles if the lock is granted immediately, more if it is blocked. semaphore unlock 1 request to unlock a semaphore. parameters include: thread semaphore number. this command is complete when the semaphore is unlocked. the time to complete an unlock is three cycles (the amount of time the busy signal is asserted). reservation release 2 request to remove a semaphore reservation from a queue. parameters include: orderid. this com- mand is complete when the reservation is released. the time to complete a release is three cycles (the amount of time the busy signal is asserted).
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 267 of 554 the following tables show the details of what is included in the commands: 7.4.9.2 error conditions these error conditions generate error interrupts on a per-thread basis: 1. exit and locked. if an exit instruction is executed and the thread still has a locked semaphore, the sema- phore manager will receive information from the dppu indicating that a thread has exited. the sema- phore manager unlocks the semaphore and then generates an error. if another thread was pending on the thread that causes the error, the pending thread might hang. table 7-72. semaphore lock input operand source name size direct indirect description semnr 1 imm16(0) imm12(0) selects one of the two semaphores that can be owned by a thread. orderid 2 imm16(3..2) imm12(3..2) 00: use unordered semaphore 01: use ordered semaphore id = 0 10: use ordered semaphore id = 1 11: no-op 4msbaddressbits 4 imm16(7..4) imm12(7..4) these four bits are ored to the upper 4 bits of address. address 32 imm16(15..8) gpr(31..0) the semaphore value (in direct mode, the 8 address bits are right-justified, the 4 msbaddress bits are ored with x ? 0 ? and left-justified, and the remaining bits are 0s) note: there are no restrictions about how orderid and semnr are used in conjunction with each other. for example, although perfectly legal, there is no requirement that a lock request wishing to use ordered semaphore id = 1 also request semnr = 1. it is acceptable to request ordered semaphore id = 1 and semnr = 0, and vice versa. table 7-73. semaphore unlock input operand source name size direct indirect description semnr 2 imm16(1..0) gpr(1..0) selects the semaphores that can be owned by a thread. 00: no-op 01: unlock semaphore semnr 0 10: unlock semaphore semnr 1 11: unlock semaphore semnr 0 and 1 table 7-74. reservation release input operand source name size direct indirect description orderid 2 imm16(3..2) gpr(1..0) 00: no-op 01: release ordered semaphore id 0 10: release ordered semaphore id 1 11: release ordered semaphore id 0 and 1 note: when using the reservation release instruction, the thread must wait for completion of this instruction before exiting. the res- ervation release command must be issued synchronously, or it must be issued asynchronously followed by a wait command to the semaphore coprocessor.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 268 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 if a thread issues an asynchronous lock request that goes pending and then exits before the request is completed, this error will be reported via the semaphore errors register. this might cause this thread to hang during a subsequent dispatch, or cause another thread requiring the requested semaphore to hang. 2. lock same sem value. if a thread tries to lock a second semaphore with the same value as the first one it already has locked, the semaphore manager will not lock the semaphore and an error will be reported via the semaphore errors register. 3. lock same sem number. if a thread tries to lock a semaphore number (0 or 1) that it already has locked (regardless of semaphore value), the semaphore manager will not grant the lock. it will unlock the semaphore that was in that position and will generate an error. if another thread was pending on the thread that causes the error, the pending thread might hang. 4. queue not enabled. if a thread tries to lock an ordered semaphore, but the orderenable[threadnum] bit is not set, an error is generated. the semaphore manager will grant the lock anyway. the lock request will internally be made to look like an unordered lock request and will be processed as though it were an unordered lock request originally. 5. label released and reservation unused. when an enqueue with label or a release label instruction is executed and the thread still has an entry pending on the ordered semaphore queues id = 0 or id = 1 which it has not used, an error will be reported via the semaphore errors register. there is a precedence with regard to multiple errors present in one command (only errors 2, 3, and 4 described above can happen all in one request). the error precedence order for errors 2, 3, and 4 is as follows: 1. error 3 (lock same sem number) will be detected and will prevent errors 2 and 4 from being detected. 2. error 4 (queue not enabled) and error 2 (lock same sem value) will both be detected if error 3 is not present. for example, a thread in its initial state might have no orderid queues enabled and semnr 0 locked with a semaphore value x. if that thread sends in a request for an ordered lock of semnr 0 and semaphore value x, it results in error conditions 2, 3, and 4. however, the semaphore manager will detect one of the errors first (error 3), generate the error interrupt, and complete the request before detecting any of the other errors. on the other hand, assuming the same initial state as above, if a thread sends in a request for an ordered lock of semnr 1 and semaphore value x, which results in errors 2 and 4, both errors will be detected and reported. 7.4.9.3 software use models a few rules should be followed when writing software that will use semaphores in order to prevent software lockups due to improper use of semaphores. these rules apply when the software is attempting to have two semaphores locked at the same time; a brief sketch of the resulting lock discipline follows the list. 1. semnr 0 and semnr 1 should always be different "families" of semaphore values. in other words, a semaphore value locked in position semnr 0 should never be locked in position semnr 1. one way to accomplish this is to use the four most significant address bits to represent resources that are allowed to be locked in one of the two "families" of semaphore values. 2. semnr 0 should always be locked first, followed by semnr 1. 3.
ordered semaphore operations (ordered lock requests or reservation releases) must be completed before the first enqueue and are not permitted after that.
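the deadlock-avoidance rules above amount to a simple lock discipline, sketched below in c under stated assumptions: sem_lock and sem_unlock are hypothetical stand-ins for the semaphore coprocessor lock and unlock commands, not real interfaces.

#include <stdint.h>

/* hypothetical stand-ins for the semaphore coprocessor lock/unlock commands */
extern void sem_lock(int semnr, uint32_t semaphore_value);
extern void sem_unlock(int semnr);

#define FAMILY(value)  ((value) >> 28)   /* upper four bits select the resource "family" (rule 1) */

void two_semaphore_example(uint32_t value_a, uint32_t value_b)
{
    /* rule 1: semnr 0 and semnr 1 must hold values from different families */
    if (FAMILY(value_a) == FAMILY(value_b))
        return;                          /* refuse rather than risk a lockup */

    sem_lock(0, value_a);                /* rule 2: lock semnr 0 first ...   */
    sem_lock(1, value_b);                /* ... then semnr 1                 */

    /* rule 3: ordered lock requests and reservation releases, if any, must
       already have completed before the first enqueue of this frame        */

    sem_unlock(1);                       /* unlock before the thread exits   */
    sem_unlock(0);
}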
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 269 of 554 7.5 interrupts and timers the np4gs3 provides a set of hardware and software interrupts and timers for the management, control, and debug of the processor. when an interrupt event occurs or a timer expires, a task is scheduled to be processed by the embedded processor complex. the interrupt or timer task does not preempt threads currently processing other tasks, but is scheduled to be dispatched to the next idle thread that is enabled to process the interrupt or timer. the starting address of the task is determined by the interrupt or timer class and read from table 7-78: port configuration memory content on page 274. 7.5.1 interrupts the interrupt mechanism within the np4gs3 has several registers: the interrupt vector registers 0-3, the interrupt mask registers 0-3, the interrupt target registers 0-3, and the software interrupt registers. 7.5.1.1 interrupt vector registers the interrupt vector register is a collection of interrupts that will share a common code entry point upon dispatch to a thread in the epc. a bit representing an individual interrupt within the interrupt vector register is only set on the initial event of the interrupt. even if the interrupt vector register is cleared, an outstanding interrupt will not set the bit in the register again until it is detected that the interrupt condition is going from an inactive to an active state. picocode can access the interrupt vector registers either through the dashboard or through the master copy. the interrupt vector register is cleared when it is read from its master copy address. when read from the dashboard, the register is not cleared. 7.5.1.2 interrupt mask registers the interrupt mask registers have a bit corresponding to each bit in the interrupt vector registers. the interrupt mask registers indicate which interrupts in the interrupt vector registers cause a task to be scheduled for processing by the epc. the interrupt mask registers have no effect on the setting of the interrupt vector registers. 7.5.1.3 interrupt target registers the interrupt target register indicates which thread types in the epc are enabled to process interrupts of a given class (0-3). 7.5.1.4 software interrupt registers the software interrupt registers provide 12 unique interrupts (three in each of the four classes) that can be defined by software and accessed through cab addresses. writing a software interrupt register sets the corresponding bit within the interrupt vector register 0-3 and has the same effect as a hardware-defined interrupt. 7.5.2 timers the np4gs3 has four timer interrupt counters that can be used to generate delayed interrupts to the epc.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 270 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.5.2.1 timer interrupt counters timer interrupt counters 0-2 are 24 bits in length and decrement at 1 ms intervals. timer interrupt counter 3 is 32 bits in length and decrements at 10 µs intervals. the timer interrupt counters are activated by writing a non-zero value to the register. when the timer decrements to a zero value, it schedules a timer task to be processed by the epc.
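to make the register roles of section 7.5.1 and the two timer resolutions above concrete, here is a minimal c model; the type and helper names are invented, the real registers are reached through the cab, and the class entry points come from the port configuration memory.

#include <stdint.h>
#include <stdbool.h>

/* model of one interrupt class (0-3): vector, mask, and target registers */
typedef struct {
    uint32_t vector;   /* latched events; a bit sets only on an inactive-to-active edge */
    uint32_t mask;     /* 1 = the corresponding event may schedule an epc task          */
    uint32_t target;   /* which thread types are enabled to process this class          */
} irq_class_t;

static bool class_has_work(const irq_class_t *c)   /* any unmasked event pending? */
{
    return (c->vector & c->mask) != 0;
}

/* timer interrupt counters: writing a non-zero value starts the countdown,
   and a timer task is scheduled when the counter reaches zero              */
static uint32_t timer0_2_count_for_ms(uint32_t delay_ms)   /* counters 0-2: 24 bits, 1 ms ticks */
{
    return delay_ms & 0x00FFFFFF;
}

static uint32_t timer3_count_for_us(uint32_t delay_us)     /* counter 3: 32 bits, 10 us ticks   */
{
    return delay_us / 10;
}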
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 271 of 554 7.6 dispatch unit the dispatch unit tracks thread usage and fetches initial data of frames prior to assigning a thread to a task. it also handles timers and interrupts by dispatching the work for these to an available thread. the dispatch unit fetches frame data from the ingress and egress data stores for frame traffic work. the data is placed into the dispatch data buffer (ddb), an array maintained by the dispatch unit. there are 12 entries in the array in np4gs3a (r1.1) and 24 entries in np4gs3b (r2.0). each entry holds data for one frame dispatch. three of the locations are fixed in use. the remaining nine (np4gs3a (r1.1)) or 12 entries (np4gs3b (r2.0)) are used for work from the remaining egress frame queues, gr0, gr1, gb0, and gb1, and the ingress gdq queues. the three fixed locations hold data for frame traffic from the ingress and egress guided frame handler (gfq) queues, the egress general table handler (gtq) queue, and the request and egress only general powerpc handler (gpq) queues. the dispatch unit selects work from interrupts, timers, and frame queues. there are nine frame queues, four timer interrupts, and four hardware interrupts. if a queue is not empty and there is room in the ddb for data for a queue type, then the dispatch unit's queue arbiter considers the queue a candidate for selection via a priority, ingress/egress driven, round-robin process. see table 7-75: priority assignments for the dispatch unit queue arbiter on page 271 for the queues and priority weights. the lower the priority number, the higher the priority selection weight. figure 7-18. dispatch unit (block diagram: the dispatch event controller, queue arbiter, port configuration memory, ingress and egress data store movers, and the ddb/dcb, fed by the ingress queues (gdq, i-gfq) and the egress queues (e-gfq, gtq, gpq, gr0, gr1, gb0, gb1), with data and port configuration memory information passed over the dispatch interface to the threads)
ibm powernp np4gs3 network processor preliminary embedded processor complex page 272 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 selection toggles between the ingress groups and egress groups. if there are no candidates in one group, then the other group may be selected during consecutive opportunities. in order to keep either the ingress or the egress work from taking up the entire ddb and thus starving the opposite group, a threshold is main- tained for both. these thresholds are compared against the nine entries (np4gs3a (r1.1)) or 12 entries (np4gs3b (r2.0)) in the ddb from the ingress and egress work groups. if a threshold is exceeded, then no further work for that group is allowed into the ddb. simulation has shown that a value of 6 (np4gs3a (r1.1)) or 16 (np4gs3b (r2.0)) for each group's threshold maximizes the dispatch unit throughput and does not allow either group to starve from lack of processor resources. once a frame has been selected by the queue arbiter, that frame ? s data is fetched into the ddb. any relevant information about the frame ? s queue entry is fetched into the dispatch control block (dcb). the amount of data fetched is dependent on the queue ? s port configuration memory contents. the port configuration memory is an array used by the dispatch unit that contains entries for all ingress ports, interrupts, and egress work queues. the dispatch event controller (dec) schedules the dispatch of work to a thread. it monitors the status of all 32 threads in the epc, the status of data movement into the ddb, and the status of the four hardware inter- rupts and four timers. work from the gfq, gtq, and gpq may only be processed by the gfh, gth, and gph-req threads respectively. threads that are allowed to handle interrupts and timers are configurable; they can be restricted to a single thread type, or can be processed by any thread. the dec assures that there is a match between an available thread and the type of work to be performed. the dec load balances threads on the available dppus and clps, keeping the maximum number of threads running in parallel in the epc. when a thread is available for work, and there is a match between the thread and ready work unit, and all the data has been moved for the work unit, then the work unit is dispatched to the thread for processing. the dec provides the data fetched from the ddb and the contents of the port configuration memory corre- sponding to the entry. when multiple units of work are ready to go that match available threads, the dec toggles between ingress and egress work units. there is no data movement for timers and interrupts. only the contents of the port configuration memory entry are passed to the thread. table 7-75. priority assignments for the dispatch unit queue arbiter queue priority ingress group i-gfq 1 gdq 2 egress group e-gfq 1 gtq 2 gpq 3 gr0 4 gr1 4 gb0 5 gb1 5
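the selection behavior just described (ingress/egress toggling, per-group ddb thresholds, and the priority weights of table 7-75) can be modeled in a few lines of c. this is a simplified software sketch, not the arbiter logic itself; in particular, the real hardware round-robins among queues of equal priority, which the model glosses over.

#include <stdbool.h>

enum group { INGRESS, EGRESS };

/* model only: pick the next frame queue to service, or -1 if none */
int arbiter_select(enum group *last,             /* group served on the previous opportunity */
                   bool queue_not_empty[2][8],   /* [group][queue], in table 7-75 order      */
                   int  queue_priority[2][8],    /* lower number = higher selection weight   */
                   int  queue_count[2],
                   int  ddb_in_use[2], int ddb_threshold[2])
{
    enum group g = (*last == INGRESS) ? EGRESS : INGRESS;   /* toggle between the groups */
    for (int attempt = 0; attempt < 2; attempt++, g = (g == INGRESS) ? EGRESS : INGRESS) {
        if (ddb_in_use[g] >= ddb_threshold[g])              /* group over its ddb threshold: */
            continue;                                       /* no further work allowed in    */
        int best = -1;
        for (int q = 0; q < queue_count[g]; q++)
            if (queue_not_empty[g][q] &&
                (best < 0 || queue_priority[g][q] < queue_priority[g][best]))
                best = q;
        if (best >= 0) { *last = g; return best; }
    }
    return -1;
}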
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 273 of 554 7.6.1 port configuration memory table 7-78: port configuration memory content on page 274 provides control, default, and software defined information on each dispatch and is passed to the thread to be stored in the configuration quadword array in the data store coprocessor. 7.6.1.1 port configuration memory index definition the port configuration memory index consists of 64 entries that are indexed based upon multiple parameters including ingress port, egress queues, timers, and interrupts as defined in table 7-76 . table 7-76. port configuration memory index port configuration memory index definition 0 .. 39 ingress sp (from gdq) 40 ingress wrap frame (i-gdq) 41 reserved 42 i-gfq 43 ingress wrap guided 44 reserved 45 gpq 46 egress gfq 47 gtq 48 egress frame in either ds0 or ds1 49 egress frame in ds0 and ds1 50 reserved 51 reserved 52 reserved 53 reserved 54 egress abort of frame in either ds0 or ds1 55 egress abort of frame in ds0 and ds1 56 interrupt 0 57 interrupt 1 58 interrupt 2 59 interrupt 3 60 timer 0 61 timer 1 62 timer 2 63 timer 3
ibm powernp np4gs3 network processor preliminary embedded processor complex page 274 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.6.2 port configuration memory contents definition bits 127.. 24 are software defined and are for use by the picocode. the remaining bits are used by the hard- ware, as defined in table 7-78 . table 7-77. relationship between sp field, queue, and port configuration memory index sp field in fcb1 queue port configuration memory index 0 .. 39 (denotes physical port) gdq 0 .. 39 0 .. 39 (denotes physical port) i-gfq 42 40 (denotes wrap port) gdq 40 40 (denotes wrap port) i-gfq 43 table 7-78. port configuration memory content field name bits description 127 .. 24 for software use pos_ac 23 0 ac not present 1 ac present ethernet/ppp 22 0 ppp port 1 ethernet port codeentrypoint 21.. 6 the default code entry point. can be overwritten by the hardware classifier (if enabled). culabgenenabled 5 0 no label is generated 1 the hardware classifier generates a label that is used by the comple- tion unit to maintain frame sequence this field must be set to 0 for port configuration memory entries 54, 55 (repre- senting aborted frames on the egress), 56-59 (interrupts), and 60-63 (timers). hardwareassistenabled 4 0 hardware classifier is disabled 1 hardware classifier is enabled and the classifier may overwrite the codeentrypoint reserved 3 .. 2 reserved. set to ? 00 ? . numberofquadwords 1.. 0 defines the number of quadwords that the dispatch unit will read from the frame and store in the datapool. a value of ? 00 ? represents four quadwords. for port configuration memory entries 56-63 (interrupts and timers), this field is ignored and the dispatch unit will not write any quadword into the datapool.
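the mapping of tables 7-76 and 7-77 from the fcb1 sp field to a port configuration memory index is easy to state as a function. the sketch below covers only ingress frame traffic (gdq and i-gfq) and uses invented names.

#include <stdbool.h>

/* model of table 7-77: port configuration memory index for ingress frame work */
static int ingress_port_config_index(int sp_field, bool from_i_gfq)
{
    if (!from_i_gfq)
        return sp_field;                 /* gdq: physical ports 0-39, wrap port 40 */
    return (sp_field == 40) ? 43 : 42;   /* i-gfq: 43 for the wrap port, else 42   */
}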
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 275 of 554 7.7 hardware classifier the hardware classifier provides hardware assisted parsing of the ingress and egress frame data that is dispatched to a thread. the results are used to precondition the state of a thread by initializing the thread ? s general purpose and coprocessor scalar registers along with the registers ? resources and a starting instruc- tion address for the clp. parsing results indicate the type of layer 2 encapsulation, as well as some informa- tion about the layer 3 frame. recognizable layer 2 encapsulations include ppp, 802.3, dix v2, llc, snap header, and vlan tagging. reported layer 3 information includes ip and ipx network protocols, five programmable network protocols, the detection of option fields, and ip transport protocols (udp and tcp). if enabled, the hardware classifier also generates labels that are passed to the completion unit with a thread identifier. the cu uses them to maintain frame order within a flow. 7.7.1 ingress classification ingress classification, the parsing of frame data which originated in the ingress eds and is now being passed from the dispatch unit to the dppu, can be applied to ethernet/802.3 frames with the following layer 2 encapsulation: dix v2, 802.3 llc, snap header, and vlan tagging. classification can also be done on pos frames using the point-to-point protocol with or without the ac field present (pos_ac) field. 7.7.1.1 ingress classification input the hardware classifier needs the following information to classify an ingress frame:  port configuration memory table entry ( ta b l e 7 - 7 8 on page 274). the hardware classifier uses the fol- lowing fields: - codeentrypoint - the default starting instruction address. - hardwareassistenabled - culabgenenabled - ethernet/ppp -posac  frame data. four quadwords must be dispatched (per frame) for ingress classification.  protocol identifiers: the hardware classifier compares the values in the protocol identifier registers with the values of the fields in the frame that correspond to those identifiers to determine if one of the config- ured protocols is encapsulated in the frame. the hardware classifier supports seven ethernet protocols (five of which are configurable) and eight point-to-point protocols (five of which are configurable). two registers need to be configured for each ethernet protocol, the ethernet type value and an 802.2 service access point value. the protocol identifiers are configured from the cab.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 276 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.7.1.2 ingress classification output the outputs of the ingress hardware classification are:  the starting instruction address that is stored in the hccia table. this address is only used when the hardware classification is enabled. when the hardware classification is disabled, or if a protocol match is not found, the code entry point from the port configuration memory table ( table 7-78: port configuration memory content on page 274) is used as the starting instruction address and is passed to the thread. if a protocol match does occur, the starting instruction address is retrieved from the last 24 entries of the hccia table. hccia values are configured from the cab. the address into hccia are provided in ta b l e 7-80: hccia table on page 277. table 7-79. protocol identifiers cab address access bits description x ? 2500 0000 ? r/w 16 ethernet type for protocol 0 x ? 2500 0010 ? r/w 16 ethernet type for protocol 1 x ? 2500 0020 ? r/w 16 ethernet type for protocol 2 x ? 2500 0030 ? r/w 16 ethernet type for protocol 3 x ? 2500 0040 ? r/w 16 ethernet type for protocol 4 x ? 2500 0050 ? r 16 ethernet type for ipx (x ? 8137 ? ) x ? 2500 0060 ? r 16 ethernet type for ip (x ? 0800 ? ) x ? 2500 0070 ? 16 reserved x ? 2500 0080 ? r/w 16 point to point type for protocol 0 x ? 2500 0090 ? r/w 16 point to point type for protocol 1 x ? 2500 00a0 ? r/w 16 point to point type for protocol 2 x ? 2500 00b0 ? r/w 16 point to point type for protocol 3 x ? 2500 00c0 ? r/w 16 point to point type for protocol 4 x ? 2500 00d0 ? r 16 point to point type for ipx (x002b ? ) x ? 2500 00e0 ? r 16 point to point type for ip (x0021 ? ) x ? 2500 00f0 ? r 16 point to point control type frame (msbs of type field is ? 1 ? ) x ? 2500 0100 ? r/w 8 service access point for protocol 0 x ? 2500 0110 ? r/w 8 service access point type for protocol 1 x ? 2500 0120 ? r/w 8 service access point type for protocol 2 x ? 2500 0130 ? r/w 8 service access point type for protocol 3 x ? 2500 0140 ? r/w 8 service access point type for protocol 4 x ? 2500 0150 ? r 8 service access point type for ipx (x ? e0 ? ) x ? 2500 0160 ? r 8 service access point type for ip (x ? 06 ? ) x ? 2500 0170 ? 8 reserved
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 277 of 554 table 7-80. hccia table hccia cab address bits classification x ? 2500 0400 ? x ? 2500 05f0 ? 16 egress locations (see section 7.7.2.1 egress classification input on page 279) x ? 2500 0600 ? 16 ethernet protocol 0 classification with no 802.1q vlan x ? 2500 0610 ? 16 ethernet protocol 1 classification with no 802.1q vlan x ? 2500 0620 ? 16 ethernet protocol 2 classification with no 802.1q vlan x ? 2500 0630 ? 16 ethernet protocol 3 classification with no 802.1q vlan x ? 2500 0640 ? 16 ethernet protocol 4 classification with no 802.1q vlan x ? 2500 0650 ? 16 ethernet ipx classification with no 802.1q vlan x ? 2500 0660 ? 16 ethernet ip classification with no 802.1q vlan x ? 2500 0670 ? 16 ingress aborted frame at time of dispatch x ? 2500 0680 ? 16 ethernet protocol 0 classification with 802.1q vlan x ? 2500 0690 ? 16 ethernet protocol 1 classification with 802.1q vlan x ? 2500 06a0 ? 16 ethernet protocol 2 classification with 802.1q vlan x ? 2500 06b0 ? 16 ethernet protocol 3 classification with 802.1q vlan x ? 2500 06c0 ? 16 ethernet protocol 4 classification with 802.1q vlan x ? 2500 06d0 ? 16 ethernet ipx classification with 802.1q vlan x ? 2500 06e0 ? 16 ethernet ip classification with 802.1q vlan x ? 2500 06f0 ? 16 ethernet vlan frame with an erif x ? 2500 0700 ? 16 point to point protocol 0 classification x ? 2500 0710 ? 16 point to point protocol 1 classification x ? 2500 0720 ? 16 point to point protocol 2 classification x ? 2500 0730 ? 16 point to point protocol 3 classification x ? 2500 0740 ? 16 point to point protocol 4 classification x ? 2500 0750 ? 16 point to point ipx classification x ? 2500 0760 ? 16 point to point ip classification x ? 2500 0770 ? 16 point to point control frame table 7-81. protocol identifiers for frame encapsulation types frame encapsulation type data store coprocessor register setting snap ethernet type dix v2 ethernet type sap dsap/ssap point to point protocol point to point type
ibm powernp np4gs3 network processor preliminary embedded processor complex page 278 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001  ingress protocol type register contains the output of the ingress hardware classifier. this data store coprocessor register is loaded with the protocol identifier that identifies the frame for the given encapsulation type. the fields in the frame data that correspond to the data store coprocessor register settings (see table 7-81) are passed to the ingress protocol type register. if a vlan tagged frame with an e-rif field is present within the frame, or if the hardware classifier is disabled, this field is invalid.  dll termination offset if the hardware classifier is enabled, the dll termination offset is loaded into gpr r0. the dll termination offset is defined as the number of bytes, starting at the beginning of the frame, to the position one byte beyond the end of the data link layer. this value is based upon the encapsulation type. typically, this is the same as the start of the layer 3 protocol header; an exception would be mpls.  classification flags: if the hardware classifier is enabled, classification flags will be loaded into the thread's gpr r1. table 7-82. general purpose register bit definitions for ingress classification flags bit definition bit 15 protocol 7 detected bit 14 protocol 6 detected bit 13 protocol 5 detected bit 12 protocol 4 detected bit 11 protocol 3 detected bit 10 protocol 2 detected bit 9 protocol 1 detected bit 8 protocol 0 detected bit 7 '0' bit 6 indicates ip options field present bit 5 dix encapsulation bit 4 indicates sap encapsulation bit 3 indicates snap encapsulation bit 2 indicates 802.1q vlan frame bit 1 indicates the 802.1q vlan id was non-zero bit 0 indicates an erif present
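the bit assignments of table 7-82 translate directly into flag masks, as in the c fragment below. the macro names are invented, and because the protocol-number mapping (for example, which of protocols 0-7 is ip) depends on how the protocol identifier registers were configured, it is passed in as a parameter here.

/* illustrative masks for the ingress classification flags left in gpr r1 (table 7-82) */
#define CLS_ERIF_PRESENT     0x0001   /* bit 0: e-rif present               */
#define CLS_VLAN_ID_NONZERO  0x0002   /* bit 1: 802.1q vlan id was non-zero */
#define CLS_VLAN_FRAME       0x0004   /* bit 2: 802.1q vlan frame           */
#define CLS_SNAP             0x0008   /* bit 3: snap encapsulation          */
#define CLS_SAP              0x0010   /* bit 4: sap encapsulation           */
#define CLS_DIX              0x0020   /* bit 5: dix encapsulation           */
#define CLS_IP_OPTIONS       0x0040   /* bit 6: ip options field present    */
#define CLS_PROTOCOL(n)      (0x0100u << (n))   /* bits 8-15: protocol n detected */

/* example test: vlan-tagged frame carrying the configured ip protocol with options */
static int is_tagged_ip_with_options(unsigned r1, unsigned ip_protocol_number)
{
    return (r1 & CLS_VLAN_FRAME) &&
           (r1 & CLS_PROTOCOL(ip_protocol_number)) &&
           (r1 & CLS_IP_OPTIONS);
}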
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 279 of 554  flow control information the hardware classifier initializes the fcbpage ? s flow control information (fcinfo) field. the field ? s value is based on ip frame information that includes the frame color indicated by bits 4:3 of the ip header ? s tos field and the tcp header ? s syn bit. the hardware classifier never sets fcinfo field to ? 1111 ? but once the field is in the fcbpage it can be written by picocode to be a ? 1111 ? . 7.7.2 egress classification egress classification, the parsing of frame that originated in the egress eds and is being transferred from the dispatcher to the dppu is limited to choosing a starting instruction address and generating a label to pass to the completion unit. 7.7.2.1 egress classification input for egress frames, the hardware classifier needs the following information to classify the frame:  port configuration memory table entry ( ta b l e 7 - 7 8 on page 274). the hardware classifier uses the fol- lowing fields: - codeentrypoint- the default starting instruction address - hardwareassistenabled - culabgenenabled  frame data. one quadword must be dispatched (per frame) for egress classification. table 7-83. flow control information values fcinfo definition 0000 tcp - green 0001 tcp - yellow 0010 tcp - red 0011 non-ip 0100 udp - green 0101 udp - yellow 0110 udp - red 0111 reserved 1000 tcpsyn- green 1001 tcpsyn - yellow 1010 tcpsyn - red 1011 reserved 1100 other ip - green 1101 other ip - yellow 1110 other ip - red 1111 disable flow control
ibm powernp np4gs3 network processor preliminary embedded processor complex page 280 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.7.2.2 egress classification output the outputs of the egress hardware classification are:  starting instruction address. the starting instruction address stored in the hccia memory is only used when the hardware classification is enabled. when the hardware classification is disabled, the code entry point from the port configuration memory table (table 7-78: port configuration memory content on page 274) is used as the starting instruction address and is passed to the thread. the address into the hccia is given by the uc field and by the fhf field in the frame header:  frame data offset gpr r0 is always set according to table 7-10: egress frames datapool quadword addresses on page 228, even when the hardware classification is disabled.  classification flags (np4gs3b (r2.0)): if the hardware classifier is enabled, classification flags will be loaded into the thread's gpr r1. table 7-84. hccia index definition: uc (1 bit), fhf (4 bits) table 7-85. general purpose register 1 bit definitions for egress classification flags bit definition bit 15:1 reserved bit 0 a link pointer error was found during dispatch unit access of the egress data store. recommended action is to discard the frame.
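reading table 7-80 and table 7-84 together, the egress hccia locations at x'2500 0400' through x'2500 05f0' form 32 entries, which matches a 5-bit index built from the uc bit and the 4-bit fhf field. the sketch below computes the corresponding cab address under that interpretation; treat it as an assumption to be checked against the hccia table rather than a stated formula.

#include <stdint.h>

/* assumed layout: egress hccia index = {uc, fhf(3:0)}, one 16-bit entry per index,
   entries spaced 0x10 apart in cab space starting at x'2500 0400'                 */
static uint32_t egress_hccia_cab_address(unsigned uc, unsigned fhf)
{
    unsigned index = ((uc & 0x1u) << 4) | (fhf & 0xFu);   /* 0 .. 31 */
    return 0x25000400u + (index << 4);
}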
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 281 of 554 7.8 policy manager the policy manager is a hardware assist of the embedded processor complex that performs policy management on up to 1 k ingress flows. it supports four management algorithms. one algorithm pair is "single rate three color marker," operated in color-blind or color-aware mode. the other is "two rate three color marker," operated again in color-blind or color-aware mode. the algorithms are specified in ietf rfcs 2697 and 2698 (available at http://www.ietf.org). the policy manager maintains up to 1024 leaky bucket meters with selectable parameters and algorithms. the picocode sends the policy manager a policy manager control block address (polcba), a color, and a packet length for each incoming packet. according to the policy management control block (polcb), two token counters stored in internal memory are regularly incremented (subject to an upper limit) by a rate specified in the polcb and, when a packet arrives, are decremented by one of four possible algorithms (depending upon the incoming packet length). after both actions are complete, the token counters generally have new values and a new color is returned to the picocode. in addition there are three 10-bit wide packet counters in the polcb (redcnt, yellowcnt, and greencnt) that use the output packet colors to count the number of bytes (with a resolution of 64 bytes) of each color. when these counters overflow, the policy manager invokes the counter manager with an increment instruction and counters maintained by the counter manager are used for the overflow count. counter definition 0 is reserved for the policy manager, and must be configured for these overflow counts. the polcb must be configured before use. configuration is accomplished via the cab. the contents of the polcb are illustrated in table 7-86: polcb field definitions on page 282. classification of a frame by the picocode must result in a polcba. the hardware classifier provides the color of the frame in the fcbpage. the frame length used must be parsed from the ip packet header by the picocode. the policy manager receives the following inputs from the picocode:  polcba (20 bits) figure 7-19. split between picocode and hardware for the policy manager (block diagram: the picocode passes polcba (20), color (2), and packetlength (16) to the policy manager hardware, which reads and updates the polcb memory, interfaces to the counter manager, and returns newcolor (2); the polcb memory is also accessible via the cab)
ibm powernp np4gs3 network processor preliminary embedded processor complex page 282 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001  color (2 bits), the color of the incoming packet. these are re-encoded as received in the ds byte as 00 = green, 01 = yellow, 10 = red. note that the encoding used does not conform to the rfc 2597.  packet length (in bytes, 16 bits) the policy manager reads the polcb from the memory, executes the algorithm configured by the type field in the polcb, writes the updated polcb back to the memory and returns the new color to the picocode. the policy manager can perform these operations once every 15 core clock cycles. picocode might use this infor- mation as follows:  perform no special action  discard the packet  change the ds byte (i.e., (re-)mark the frame) table 7-86. polcb field definitions (page 1 of 2) field size description type 4 algorithm type. this field must be initialized by picocode. 0000 color blind single rate three color marker 0001 color aware single rate three color marker 0010 color blind two rate three color marker 0011 color aware two rate three color marker 0100-1111 reserved pa_time 32 previous arrival time in ticks. this field must be initialized to 0 by the picocode. pa_time is compared to a running 32-bit counter which is incremented every 165/150 ns. selection of the accuracy of the counter tick is controlled by the setting of the dram parameter regis- ter bit 22 (11/10 ). when set to 1 the tick rate is 165 ns when set to 0 the tick rate is 150 ns. c_token 26 token counter for committed rate accounting in bytes. this field must be initialized by pic- ocode to contain the same value as c_burstsize. (format is 17.9) ep_token 26 token counter for excess or peak rate accounting in bytes. this field must be initialized by picocode to contain the same value as ep_burstsize. (format is 17.9) cir 12 committed information rate in bytes/tick used for two rate algorithms. cir uses an expo- nential notation x * 8 y-3 where 11:3 x 2:0 y; valid values of y are ? 000 ? through ? 011 ? a tick for the policy manager is defined to be either 165 or 150 ns. selection of the value for a tick is controlled by the setting of the dram parameter register bit 22 (11/10 ). when set to 1 the tick is 165 ns, when set to 0 the tick is 150 ns. this field must be initialized by picocode to meet the service level agreement. the pir must be equal to or greater than the cir. cir can be defined in the range of 100 kbps through 3 gbps. pir 12 peak information rate in bytes/tick used for two rate algorithms. pir uses an exponential notation x * 8 y-3 where 11:3 x 2:0 y; valid values of y are ? 000 ? through ? 011 ? a tick for the policy manager is defined to be either 165 or 150 ns. selection of the value for a tick is controlled by the setting of the dram parameter register bit 22 (11/10 ). when set to 1 the tick is 165 ns, when set to 0 the tick is 150 ns. this field must be initialized by picocode to meet the service level agreement. the pir must be equal to or greater than the cir. pir can be defined in the range of 100 kbps through 3 gbps.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 283 of 554 c_burstsize 17 committed burst size in bytes. this field must be initialized by picocode to meet the service level agreement. for the single rate algorithms, either the c_burstsize or the ep_burstsize must be larger than 0. it is recommended that when the value of c_burstsize or the ep_burstsize is larger than 0, it is larger than or equal to the size of the mtu for that stream. for the two rate algorithms, c_burstsize must be greater than 0. it is recommended that it is larger than or equal to the size of the mtu for that stream. ep_burstsize 17 excess or peak burst size in bytes. definition depends on algorithm selected. this field must be initialized by picocode to meet the service level agreement. for the single rate algorithms, either the c_burstsize or the ep_burstsize must be larger than 0. it is recommended that when the value of c_burstsize or the ep_burstsize is larger than 0, it is larger than or equal to the size of the mtu for that stream. for the two rate algorithms, ep_burstsize must be greater than 0. it is recommended that it is larger than or equal to the size of the mtu for that stream. greencnt 10 number of bytes (with 64-byte resolution) in packets flagged as ? green ? by the policy man- ager. when this counter overflows, the policy manager uses the counter manager interface to increment an extended range counter. this field must be initialized to 0 by picocode. the counter control block for the policy man- ager must be configured at counter definition table entry 0. the green count is counter number 0. yellowcnt 10 number of bytes (with 64-byte resolution) in packets flagged as ? yellow ? by the policy man- ager. when this counter overflows, the policy manager uses the counter manager interface to increment an extended range counter. this field must be initialized to 0 by picocode. the counter control block for the policy man- ager must be configured at counter definition table entry 0. the yellow count is counter number 1. redcnt 10 number of bytes (with 64-byte resolution) in packets flagged as ? red ? by the policy man- ager. when this counter overflows, the policy manager uses the counter manager interface to increment an extended range counter. this field must be initialized to 0 by picocode. the counter control block for the policy man- ager must be configured at counter definition table entry 0. the red count is counter number 2. table 7-86. polcb field definitions (page 2 of 2) field size description
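the cir and pir fields of table 7-86 use the exponential notation x * 8^(y-3) bytes per tick, with the tick equal to 150 ns or 165 ns depending on dram parameter register bit 22. the helper below simply makes that arithmetic concrete; it is a reading aid, not a hardware or picocode interface.

#include <stdint.h>

/* decode a 12-bit cir/pir field: bits 11:3 = x, bits 2:0 = y (valid 0-3),
   rate = x * 8^(y-3) bytes per policy manager tick                        */
static double polcb_rate_bytes_per_second(uint16_t field, int tick_is_165ns)
{
    unsigned x = (field >> 3) & 0x1FFu;   /* 9-bit mantissa           */
    unsigned y = field & 0x7u;            /* exponent, 0 through 3    */
    double bytes_per_tick = (double)x;
    for (unsigned i = y; i < 3; i++)      /* apply 8^(y-3) for y <= 3 */
        bytes_per_tick /= 8.0;
    double tick_seconds = tick_is_165ns ? 165e-9 : 150e-9;
    return bytes_per_tick / tick_seconds;
}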
ibm powernp np4gs3 network processor preliminary embedded processor complex page 284 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.9 counter manager the counter manager is a hardware assist engine used by the epc to control various counts used by the picocode for statistics, flow control, and policy management. the counter manager is responsible for counter updates, reads, clears, and writes, and it allows the picocode to access these functions using single instruc- tions. the counter manager arbitrates between all requestors and acknowledges when the requested opera- tion is completed. the counter manager works in concert with the counter coprocessor logic to allow the picocode access to the various counters. the counter manager supports the following:  64-bit counters  32-bit counters  24/40-bit counters  read, read/clear, write, increment, and add functions  a maximum of 1 k 64-bit, 2 k 32-bit, or some mix of the two not to exceed a total size of 64 kb, of fast internal counters  up to 4 m 64-bit or 32-bit external counters. two 32-bit counters are packed into a 64-bit dram line. selection of the counter is accomplished by the low-order address bit.  counter definition table for defining counter groups and storage locations  interfaces to all eight dppus, the policy manager, ingress and egress flow control  five independent counter storage locations
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 285 of 554 figure 7-20. counter manager block diagram (the eight dppu counter coprocessors, the policy manager, and the ingress and egress flow control feed the multiplexer/arbitration logic; the counter definition table and the address & update value logic drive five counter queues, cntrq0 - cntrq4, which access the internal counter memory and the external dram d2 banks a - d; read logic returns read data to the counter coprocessors)
ibm powernp np4gs3 network processor preliminary embedded processor complex page 286 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.9.1 counter manager usage the counter manager manages various counters for the epc. the epc, policy manager, and the ingress and egress flow control have the ability to update these counters. the picocode accesses these counters for statistics, flow control actions, and policy decisions. before a counter can be used, its counter definition entry must be configured. this entry defines where the counter is stored, how many counters are associated with this set, and the counter ? s size. the counter manager supports 256 different counter definition entries. the counter definition entry is written to the counter definition table using the cab. the entry format in shown in table 7-90 . table 7-87. counter manager components component name description counter definition table contains definitions for each counter block (256 entries). a counter block is made up of the counter ? s memory storage location (bank a, bank b, bank c, bank d, or internal), the base address within that memory, the number of counters within the set, and the counter size. multiplexing and arbitration multiplexing and arbitration logic selects the next counter action from all requestors according to prior- ity. flow control has the highest priority, policy manager the next, and the set of dppus the lowest pri- ority. multiplexing and arbitration logic uses two work conserving round-robins: one between the two flow control requestors and one for the set of dppus. this logic returns the read data to the appropri- ate requestor during counter reads. address and update value address and update value logic uses information gathered from the counter definition table and the parameters passed from the requestor to create the final counter address and update value. cntrq0 - cntrq4 five counter queues used to temporarily hold the counter request and allow the counter manager to access all five memory locations independently. the request is placed into the appropriate queue based on the memory storage location information found in the counter definition table. read read logic gathers the read data from the appropriate memory location and returns it to the requestor. internal counter memory internal counter memory holds the ? fast ? internal counters and can be configured to hold 64-bit, 24/40- bit, or 32-bit counters. the internal counter memory size is 1024 locations x 64 bits. table 7-88. counter types type name description 64-bit counter counter that holds up to a 64-bit value. 24/40-bit counter special 64-bit counter that has a standard 40-bit portion and a special 24-bit increment portion. the 40- bit portion is acted upon by the command passed and the 24-bit portion is incremented. this counter allows ? byte count ? and ? frame count ? counters to be in one location; the 40-bit portion is the byte count, and the 24-bit portion is the frame count. 32-bit counter counter that holds up to a 32-bit value. table 7-89. counter actions action name description read read counter and return value (either 64 or 32 bits) to the epc. read/clear read counter and return value (either 64 or 32 bits) to the epc. hardware then writes counter to zero. write write 16 bits to the counter (either bits 31:16 or 15:0) and zero all other bits. increment add one to the counter value and store updated value in memory. add add 16 bits to the counter value and store updated value in memory.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 287 of 554 the counter definition entry describes a set of counters that have similar characteristics and are referenced from the picocode as a single group. the following figure shows several counter definition examples. table 7-90. counter definition entry format field bits definition reserved 31:30 not used. 24/40 29 flag to indicate 24/40 counter (1 = 24/40). if set, 64 / 32 must also be set to 1. 64 / 32 28 flag to indicate a 64-bit counter (1 = 64) or a 32-bit counter (0 = 32). number of counters 27:23 number of counters within this counter set. legal values: 1, 2, 4, 8, or 16 counter resource location 22:20 storage used for the counter block. 000 bank a 001 bank b 010 bank c 011 bank d 100 internal memory base address 19:0 base address within the memory where the counter block starts. counter addresses are based on a 64-bit word, however, all address bits are not used for all counter resource locations and when a 32-bit counter is indi- cated (bit 28 is set to 0), address bit 0 indicates the counter location within the 64-bit word ( 0 = bits 31:0, 1 = bits 63:32). the following illustrates the use of the base address bits. 32-bit counter resource 64-bit word counter location address ---------- ------------------------- -------------- yes '000' - '011' 19:1 yes '100 ? 11:1 no '000' - '011' 19:0 no '100 ? 10:0
ibm powernp np4gs3 network processor preliminary embedded processor complex page 288 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 each counter definition entry can be used for a block of counter sets. the definition describes one counter set and the picocode can reference several consecutive counter sets using the same definition. for example, a counter set is defined as four counters: one for frames less than 500 bytes, one for frames between 500 and 1000 bytes, one for frames between 1001 and 1518 bytes, and the last one for frames greater than 1518. one counter definition entry describes the counters, and picocode can use the definition to reference 40 similar sets of these counters, that is, one for each source port. counter set 0 is located at the base address defined by the entry, counter set 1 is located at the next available address, and so on. figure 7-21 and figure 7-22 show examples of counter definition entries, blocks, and sets. figure 7-21. counter definition entry (examples: entry 3 - four counters per set, counter resource bank a, base address x'0 1004'; entry 8 - eight counters per set, counter resource bank c, base address x'0 1608'; entry 45 - two counters per set, counter resource internal memory, base address x'0 0044')
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 289 of 554 to reference the correct counter, the requestor must pass the counter manager several parameters. these parameters are described in table 7-91. figure 7-22. counter blocks and sets (example: counter block 3, described by counter definition entry 3, holds counter sets 0 through 39 - one per source port - each consisting of four counters for frames of < 500, 500 - 1000, 1001 - 1518, and > 1518 bytes, stored in bank a starting at base address x'0 1004')
ibm powernp np4gs3 network processor preliminary embedded processor complex page 290 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 the counter manager builds the actual counter address using the following algorithm: address = base address + (counter set index * number of counters) + counter number this address is used to access the counter within the appropriate memory (bank a, bank b, bank c, bank d, or internal). when a 32-bit counter is accessed, the low order bit of the address selects which 32 bits of the 64-bit memory word are being used: 0 = bits 31:0, 1 = bits 63:32. the location of the counter and its size determine how many address bits are used by the counter manager as seen in table 7-92 . table 7-91. counter manager passed parameters parameter bits definition counter definition table index 8 counter definition entry to use for this action counter set index 20 set of counters to reference counter number 4 counter within the set to reference action 3 action to perform on the counter modify 000 increment by 1 001 add 16 bits to counter read 100 standard read 101 read then clear value write 110 write bits 15:0 of counter 111 write bits 31:16 of counter all other code points are reserved. add/write value 16 value to add to counter when modify/add selected value to write to counter when write selected flow control action (counter definition table index offset) 2 only used by flow control interfaces 00 standard enqueue 01 discard (resulting from the transmit probability table) 10 tail drop discard 11 reserved table 7-92. counter manager use of address bits memory location counter size number of counters stored at single memory address number of address bits used total counters possible internal 32 2 11 (10:0) where bit 0 selects upper or lower 32 bits 1k loc * 2 per = 2k 64, 24/40 1 10 (9:0) 1k loc * 1 per = 1k dram banks (a,b,c,ord) 32 2 20 (19:0) where bit 0 selects upper or lower 32 bits 512k loc * 2 per * 4 banks = 4m 64, 24/40 1 20 (19:0) 1m loc * 1 per * 4 banks = 4m
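the address computation quoted above, together with the 32-bit packing rule from table 7-90 and table 7-92, can be written out directly. this is a transcription into c for clarity only; the helper names are not hardware or picocode interfaces.

#include <stdint.h>

/* address = base address + (counter set index * number of counters) + counter number */
static uint32_t counter_word_address(uint32_t base_address,
                                     uint32_t counter_set_index,
                                     uint32_t number_of_counters,
                                     uint32_t counter_number)
{
    return base_address + counter_set_index * number_of_counters + counter_number;
}

/* for 32-bit counters, two counters share one 64-bit memory word; bit 0 of the
   address selects the half (0 = bits 31:0, 1 = bits 63:32)                     */
static uint32_t extract_32bit_counter(uint64_t word, uint32_t address)
{
    return (address & 1u) ? (uint32_t)(word >> 32) : (uint32_t)word;
}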
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 291 of 554 the counter manager supports a special mode of operation when used by the ingress or egress flow control. this mode allows ? delayed increments ? to occur based on flow control information. both flow controls pass an additional parameter, flow control action, that causes the counter manager to modify which counter definition entry is used. a standard enqueue sent by either flow control logic causes the counter manager to use the counter definition index that is passed with the request. a discard resulting from the transmit probability table causes the counter manager to access the counter definition entry that is located at index+1. a tail drop discard causes the counter manager to access the counter definition entry located at index+2. each type of flow control action uses a different counter definition. when the policy manager requests use of the counter manager, it always uses a counter definition table index of 0. this location is reserved for use by the policy manager. the counter manager arbitrates for a new counter action at a rate of one each 15 ns. when accessing the internal memory, the counter manager can update these counters at a rate of one every 15 ns. the external dram rates are once each 150 ns or each 165 ns depending on dram cycle configuration (10- or 11-cycle windows), and one to four counters (one per bank) will be updated during this time. if the counter manager is updating some (but not all) dram banks, the banks not being updated by the counter manager will have their last location (address = all 1's) written during the dram write window. therefore, the last location of each dram bank must be reserved for this use and cannot be assigned as a counter location. the counter manager supports the following actions: read, read/clear, write/lower, write/upper, increment and add. the dppus can read any counter in the counter manager. the flow control and policy manager logic do not perform counter reads. when a counter is read, the counter manager returns the read data (either 32 or 64 bits) with the acknowledge signal. the picocode can also clear the counter (set to zero) after the read is performed by passing the read/clear action. counter writes of 16 bits are supported and either bits 15:0 or bits 31:16 of the selected counter are set to the write value with all other bits set to zero (this feature is provided to assist in chaining unused counters into a free list). the counter manager also supports incrementing or adding to the selected counter. the add value can be up to 16 bits in size and will be added to the counter (either 32, 40, or 64 bits). when a 24/40 counter is incre- mented or added, the 24-bit portion of the counter is incremented and the 40-bit portion receives the incre- ment/add action. the counter manager supports the following maximum numbers of counters (actual number depends on size of dram used and the counter definition table information):  up to 1 k 64-bit, or 2 k 32-bit internal counters, or some mix of 64 and 32-bit counters not to exceed 64 kb total size.  up to 1 m 64-bit, or 1 m 32-bit external counters per dram bank, or some mix of 64- and 32-bit counters not to exceed 1 m total counters per bank.
ibm powernp np4gs3 network processor preliminary embedded processor complex page 292 of 554 np3_dl_sec07_epc.fm.08 may 18, 2001 7.10 semaphore manager the semaphore manager and coprocessor are available in np4gs3b (r2.0). the semaphore manager is centrally located within the epc and is controlled by a thread through a semaphore coprocessor. a semaphore is a mechanism for acquiring ownership of, or "locking down," an entity. in this implementation a semaphore is a 32-bit value. once a thread has exclusive ownership of a semaphore, it is guaranteed that there are no other threads that own a semaphore with the same value (though there may be other threads owning semaphores with different values). thus, other threads that also want ownership of a semaphore with the same value are blocked until the semaphore is unlocked. it is up to the programmer to attach a meaning to a semaphore. that is, the semaphore manager does not know what a semaphore represents - it is just a string of 32 bits. semaphores can be seen as having a 32-bit address space and the programmer can map this to anything, like the tree search memory, the data store, or the embedded powerpc. for example, a tree search memory address can have the upper four bits of the semaphore value set to '0000', an ingress data store address can have the upper four bits of the semaphore value set to '0001', and the embedded powerpc mailbox may have the upper four bits set to '1111'. in np4gs3b (r2.0), when configured for ordered semaphores, a thread may have only one semaphore locked at a time. when the np4gs3b (r2.0) is configured for unordered semaphores only, then a thread may have two semaphores (unordered) locked at a time.  unordered semaphores - when multiple threads request a semaphore, the semaphore manager will grant the request through a round-robin fairness algorithm. a thread can lock-and-unlock an unordered semaphore as often as it wants. for example, it can lock-and-unlock, execute some code, lock again, execute more code, unlock, etc.  ordered semaphores - when multiple threads request a semaphore, the semaphore manager will grant the request in the order of frame dispatch for a given flow. in other words, when a thread is processing a frame of a certain flow that has been dispatched before any other frames of the same flow, then it is guaranteed by the semaphore manager that this thread will get a certain semaphore value before any other threads that are processing frames of the same flow. the dispatch order will be maintained by placing a "reservation" into a queue. only semaphore requests whose reservation is at the top of the queue will be serviced. it is the responsibility of the picocode to release a reservation if it isn't needed or a deadlock situation will occur. to use ordered semaphores, they must be enabled. this is done by writing the ordered semaphore enable register in the hardware classifier. there is one bit for the semaphore 0 id queue, and one bit for the semaphore 1 id queue. at dispatch time, the hardware classifier may assign a label to a frame as per the port configuration table. if a label is assigned, the hardware classifier will also look at the ordered semaphore enable register to see if ordered semaphores are enabled as well. if no labels are assigned, ordered semaphores will not be enabled. the label is treated as a 29-bit tag, allowing 2^29 flows. the frame flow queue and the semaphore ordering queues use the same label.
if a frame order label is released (via an enqueue with label or release label), the semaphore order label must first have either been used or released, otherwise an error will occur. a thread can lock-and-unlock an ordered semaphore up to two times (this is an implementation restriction). during a request, the thread must specify the orderid, which is 0 or 1. an order queue can only be used once, with two order queues implemented for a total of two ordered semaphores per thread. in other words, a thread can only lock orderid 0 once, and orderid 1 once.
ibm powernp np4gs3 preliminary network processor np3_dl_sec07_epc.fm.08 may 18, 2001 embedded processor complex page 293 of 554 each thread can have ownership of up to two semaphores simultaneously. when using two semaphores, the programmer must take great care to avoid deadlock situations (semaphores can only be requested one by one). for example, a semaphore lock must be followed at some point by a semaphore unlock or a deadlock will occur. the semaphore manager employs a pending concept to ordered semaphore lock requests, which enables it to look past the head of the completion unit queues. the basic idea is if a thread is requesting an ordered lock for a particular semaphore value that is already locked, it will be moved into a pending state, and it ? s request will not be considered complete. at this point, the threads that are behind it in the flow queues can now be considered for locking as long as their semaphore values are different. once the original semaphore value that was locked is unlocked, then the pending thread ? s semaphore value will lock and that thread ? s request will be completed. it is important to note that only one thread per semaphore value will go into the pending state. the semaphore manager will process the three semaphore coprocessor command types in parallel:  lock commands for ordered semaphores at the head of the queue and lock commands for unordered semaphores will be evaluated at the same priority level and a winner will be chosen using a round-robin fairness algorithm. - lock commands will be evaluated in parallel with unlock and reservation release commands.  unlock commands - unlock commands will be evaluated in parallel with lock and reservation release commands.  reservation release commands - reservation release commands will be evaluated in parallel with lock and unlock commands.
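the "upper four bits as a resource family" convention suggested at the start of this section (and in software use model rule 1 of section 7.4.9.3) might be coded as below; the family codes are only the examples given in the text and carry no hardware meaning.

#include <stdint.h>

/* example semaphore value layout: upper 4 bits name the resource family,
   lower 28 bits carry a resource-specific address or tag (convention only) */
enum sem_family {
    SEM_TREE_SEARCH_MEMORY = 0x0,   /* '0000' */
    SEM_INGRESS_DATA_STORE = 0x1,   /* '0001' */
    SEM_POWERPC_MAILBOX    = 0xF    /* '1111' */
};

static uint32_t make_semaphore_value(enum sem_family family, uint32_t resource_address)
{
    return ((uint32_t)family << 28) | (resource_address & 0x0FFFFFFFu);
}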
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 295 of 554 8. tree search engine 8.1 overview the tree search engine (tse) is a hardware assist that performs table searches. tables in the network processor are maintained as patricia trees, with the termination of a search resulting in the address of a leaf page. the format of a leaf page or object is defined by the picocode; the object is placed into a control store, either internal (h0, h1) or external (z0, d0 - d3). 8.1.1 addressing control store (cs) references to the control store use a a 26-bit address as shown in table 8-1 . each address contains a memory id, a bank number, and address offset. table 8-2: cs address map and use on page 295 provides addressing information and recommended uses for each memory that is supported by the csa and that is accessible via the tse. the memory type provides access width and memory size information. selection for d0 as either single or double wide is by configura- tion ( 13.1.2 dram parameter register (dram_parm) on page 440). table 8-1. control store address mapping for tse references 26-bit address used to reference the control store 4 bits 2 bits 20 bits memory id bank number offset virtual bank address table 8-2. cs address map and use (page 1 of 2) memory id bank no. memory name type offset width (bits) recommended use 0000 00 null 20 01-11 reserved 0001 00 z0 external zbt sram 19 fast pscb 01-11 reserved 512k x 36 0010 00 h0 internal sram 2k x 128 11 fast leaf pages 01 h1 internal sram 2k x 36 11 fast internal sram 10 reserved 11 reserved 0011 00-11 reserved 0100 - 0111 00-11 reserved
ibm powernp np4gs3 network processor preliminary tree search engine page 296 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.1.2 d6 control store. the np4gs3 supports the following ddr dram types for the d6 control store: the d6 control store can be accessed either via the tse or by the embedded powerpc's processor local bus (plb) (see 10.2 processor local bus and device control register buses on page 366). 1000 00-11 d0 banks a - d ddr dram 18 - 20 single (d0_width = 0) or double wide (d0_width = 1) configuration. leaf pages d0_width = 0 8mb 16 mb 32 mb d0_width = 1 16 mb 32 mb 64mb 1001 00-11 d1 banks a - d ddr dram 18 - 20 leaf pages 1010 00-11 d2 banks a - d 8 mb leaf or counter pages 1011 00- 1 d3 banks a - d 16 mb dt entries and pscb entries 32 mb 1100 00- 1 d6 banks a - d ddr dram 8mb(r2.0) 16 mb (r2.0) 32 mb 64mb 128 mb 20 powerpc external memory 1101 1110 1111 note: ddr dram is specified as: number of banks x number of entries x burst access width (in bits). ddr dram total memory space (mb) notes 4x1mx16 x1chip 8 1 4x2mx16 x1chip 16 1 4x4mx4 x4 chips 32 4x4mx16 x1chip 32 1 4x8mx4 x4 chips 64 4x16mx4 x4chips 128 note: 1. np4gs3b (r2.0) or later table 8-2. cs address map and use (page 2 of 2) memory id bank no. memory name type offset width (bits) recommended use
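the 26-bit control store address described in 8.1.1 (4-bit memory id, 2-bit bank number, 20-bit offset) can be modelled with simple shift/mask arithmetic. the c sketch below is a minimal illustration of that field split; the function and struct names are assumptions, not part of any np4gs3 api.

```c
#include <stdint.h>
#include <stdio.h>

/* decomposed form of the 26-bit tse control store address:
 * [25:22] memory id, [21:20] bank number, [19:0] offset. */
struct cs_addr {
    uint8_t  mem_id;  /* 4 bits  */
    uint8_t  bank;    /* 2 bits  */
    uint32_t offset;  /* 20 bits */
};

static uint32_t cs_pack(struct cs_addr a)
{
    return ((uint32_t)(a.mem_id & 0xF) << 22) |
           ((uint32_t)(a.bank   & 0x3) << 20) |
           (a.offset & 0xFFFFFu);
}

static struct cs_addr cs_unpack(uint32_t raw)
{
    struct cs_addr a;
    a.mem_id = (raw >> 22) & 0xF;
    a.bank   = (raw >> 20) & 0x3;
    a.offset =  raw        & 0xFFFFFu;
    return a;
}

int main(void)
{
    /* memory id '1000' (d0), bank c, offset 0x00123 */
    struct cs_addr a = { 0x8, 0x2, 0x00123 };
    uint32_t raw = cs_pack(a);
    struct cs_addr b = cs_unpack(raw);
    printf("raw = 0x%07x, mem=%x bank=%x off=0x%05x\n",
           raw, b.mem_id, b.bank, b.offset);
    return 0;
}
```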
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 297 of 554 8.1.3 logical memory views of d6 using the plb, the embedded powerpc or a pci attached host views a flat, contiguous address space addressable on byte boundaries. access via the tse uses the control store address mapping described in 8.1.1 addressing control store (cs) on page 295, and is composed of a memory id, bank number and offset. the offset limits addressing capability to 8-byte boundaries within the ddr dram. table 8-3 illustrates the plb to d6 control store address translation. the plb address is shown within the body of the table and the d6 control store address is formed using the memory id, bank number, and offset. d6 control stores implemented in ddr dram of up to 32 mb in size are accessed by the tse using only memory id '1100'. larger d6 control stores require memory ids of '1101' (up to 64 mb), '1110' (up to 96 mb) and '1111' (up to 128 mb). a method to share memory objects between the embedded powerpc or attached pci host using the plb, and picocode running on the np4gs3 using tse instructions is to specify the object (as viewed by the tse) with a height of one and a width of four. this allows the tse access to all of the d6 control store, on 32-byte boundaries, and is consistent with the plb's flat contiguous address space view. successive objects with a height of one and a width of four are accessed by incrementing the tse address until a 32mb boundary is reached. at the boundary it is necessary to increment the memory id to address the next successive object. table 8-3. plb and d6 control store addressing memory id (4 bits) bank a ('00') bank b ('01') bank c ('10') bank c ('11') offset (20 bits) '1100' 0-7 8-x'f x'10-x'17 x'18 - x'1f x ? 00 0000 ? x'20-x'27 x'28-x'2f x'30-x'37 x'38-x'3f x ? 00 0001 ? x'07f ffe0 - x'07f ffe7 x'07f ffe8 - x'07f ffef x'07f fff0 - x'07f fff7 x'07f fff8 - x'07fffff(8mb-1) x ? 03 ffff ? x'0ff ffe0 - x'0ff ffe7 x'0ff ffe8 - x'0ff ffef x'0ff fff0 - x'0ff fff7 x'0ff fff8 - x'0ff ffff (16mb-1) x ? 07 ffff ? x'1ff ffe0 - x'1ff ffe7 x'1ff ffe8 - x'1ff ffef x'1ff fff0 - x'1ff fff7 x'1ff fff8 - x'1ff ffff (32mb-1) x ? 0f ffff ? '1101' x'200 0000 - x'200 0007 x'200 0008 - x'200 000f x'200 0010 - x'200 0017 x'200 0018 - x'200 001f x ? 00 0000 ? x'3ff ffe0 - x'3ff ffe7 x'3ff ffe8 - x'3ff ffef x'2ff fff0 - x'2ff fff7 x'3ff fff8 - x'3ff ffff (64mb-1) x ? 0f ffff ? x'400 0000 - x'400 0007 x'400 0008 - x'400 000f x'400 0010 - x'400 0017 x'400 0018 - x'400 001f x ? 00 0000 ? x'5ff ffe0 - x'5ff ffe7 x'5ff ffe8 - x'5ff ffef x'5ff fff0 - x'5ff fff7 x'5ff fff8 - x'5ff ffff (96mb-1) x ? 0f ffff ? '1111' x'600 0000 - x'600 0007 x'600 0008 - x'600 000f x'600 0010 - x'600 0017 x'600 0018 - x'600 001f x ? 00 0000 ? x'7ff ffe0 - x'7ff ffe7 x'7ff ffe8 - x'7ff ffef x'7ff fff0 - x'7ff fff7 x'7ff fff8 - x'7ff ffff (128mb-1) x ? 0f ffff ?
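table 8-3 implies a simple arithmetic relationship between the flat plb byte address and the (memory id, bank, offset) triple used by the tse: banks interleave on 8-byte boundaries, the offset advances every 32 bytes, and the memory id steps up at each 32 mb boundary starting from '1100'. the sketch below is one reading of that table, not a verified hardware algorithm, and the function name is invented for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <assert.h>

/* translate a flat plb byte address (0 .. 128 mb - 1) within the d6
 * control store into the tse view: memory id, bank number, offset. */
static void plb_to_d6(uint32_t plb, unsigned *mem_id,
                      unsigned *bank, uint32_t *offset)
{
    *mem_id = 0xC + (plb >> 25);          /* '1100' + 32 mb increments */
    *bank   = (plb >> 3) & 0x3;           /* 8-byte bank interleave    */
    *offset = (plb >> 5) & 0xFFFFFu;      /* one offset per 32 bytes   */
}

int main(void)
{
    unsigned id, bank;
    uint32_t off;

    plb_to_d6(0x0000020u, &id, &bank, &off);   /* 2nd 32-byte row */
    assert(id == 0xC && bank == 0 && off == 0x00001);

    plb_to_d6(0x07FFFF8u, &id, &bank, &off);   /* 8 mb - 8        */
    assert(id == 0xC && bank == 3 && off == 0x3FFFF);

    plb_to_d6(0x2000000u, &id, &bank, &off);   /* 32 mb boundary  */
    assert(id == 0xD && bank == 0 && off == 0x00000);

    printf("translations match table 8-3\n");
    return 0;
}
```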
ibm powernp np4gs3 network processor preliminary tree search engine page 298 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.1.4 control store use restrictions the following restrictions apply:  when d2 is used by the counter manager, the last location (highest address) in each bank must not be assigned for any use.  dt entries can be stored only in h1, z0, or d3. for np4gs3a (r1.1), lpm dt entries are restricted to d3 only.  pscbs can be stored only in h1, z0, or d3.  leaf pages can be stored only in h0, d0, d1, and d2. 8.1.5 object shapes object shapes specify how the control store stores an object such as a leaf or pattern search control block (pscb). a leaf is a control block that contains the reference key as a reference pattern. the pattern uniquely identifies the leaf in a tree and contains the data needed by the application initiating a tree search. the data is application-dependent, and its size or memory requirements are defined by the ludeftable entry for the tree. see table 8-5: height, width, and offset restrictions for tse objects on page 301 and table 8-17. ludeftable entry definitions on page 312 for details. shape is defined by two parameters, height and width. objects small enough to fit within a single memory or bank location are defined as having a height and width of 1 (denoted by 1,1), and therefore do not require shaping. for example, both a 32-bit and a 48-bit object stored in a ddr sdram bank would have a shape of (1,1). when objects do not fit into a single memory or bank location, they have heights and/or widths > 1:  height denotes the number of consecutive address locations in which an object is stored. for example, if the height of an object = 4 and the control store address = a, the object is stored in locations a, a+1, a+2, and a+3.  width denotes the number of consecutive banks in which the object is stored. width always = 1 for objects stored in zbt sram, and could be > 1 for objects in ddr sdram. for example, if the width of an object = 3, and its height = 1, the object is stored in three consecutive banks (the virtual bank address is incremented by 1).  an offset increment can carry out into the bank address bits. this is not a supported use of the cs mem- ory, and table allocation algorithms must be defined to avoid this condition. table 8-4. dtentry, pscb, and leaf shaping object shape fm dtentry/pscb always has a shape of height = 1, width = 1. smt dtentry/pscb always has a shape of height = 1, width = 1. lpm dtentry/pscb has a shape of height = 1, width = 1 or height = 2, width = 1 depending on the memory in which the pscb resides. a memory with a line width of at least 64 bits should be used with height = 1 and a memory of 36 bits should be used with height = 2. leaf can have any shape that is allowed by the memory in which the leaf is located - maximum of 512 bits.
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 299 of 554  for height and width, the hardware automatically reads the appropriate number of locations. from a pico- code point of view, an object is an atomic unit of access. restrictions to height, width, and offset are given in table 8-5: height, width, and offset restrictions for tse objects on page 301 . table 8-17: ludeftable entry definitions on page 312 specifies the leaf and pscb shapes. figure 8-1. example shaping dimensions on page 300 illustrates some example placement of objects with different shapes in control store. an object may span more than one dram as shown in examples (b) and (c) with the following restrictions:  an object may span d1 and d2.  when d0 is configured as a single wide (d0_width = 0), then an object may span d0 and d1.  objects may not span out of d0 when d0 is configured as a double wide (d0_width = 1).  objects may not span into or out of z0, h0, h1, d3 or d6.  shape object can define a maximum of 512 bits for any memory.
figure 8-1. example shaping dimensions (figure not reproduced in text; examples (a), (b), and (c) show objects with shapes such as (1,1), (2,1), (3,1), (5,1), (1,2), (1,3), and (1,4) placed in h0 (128-bit), z0 (36-bit), and d1/d2 ddr banks a - d (64-bit))
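because shaping is simply consecutive offsets (height) and consecutive virtual banks (width), the set of control store locations an object occupies can be enumerated directly from its base address and shape. the sketch below assumes the 4/2/20-bit address split from 8.1.1 and invented helper names; it also mirrors the caution above that an offset increment must not carry into the bank bits.

```c
#include <stdint.h>
#include <stdio.h>

/* enumerate the (bank, offset) locations occupied by an object of the
 * given shape: width walks consecutive virtual banks, height walks
 * consecutive offsets within each bank. */
static int expand_shape(uint32_t base, unsigned height, unsigned width)
{
    unsigned mem_id = (base >> 22) & 0xF;
    unsigned bank   = (base >> 20) & 0x3;
    uint32_t offset =  base        & 0xFFFFFu;

    for (unsigned w = 0; w < width; w++) {
        for (unsigned h = 0; h < height; h++) {
            /* an offset increment must not carry into the bank bits;
             * table allocation has to avoid this condition. */
            if (offset + h > 0xFFFFFu)
                return -1;
            /* bank + w may step into the next dram only where the
             * spanning rules above allow it (e.g. d1 into d2). */
            printf("mem %x bank %x offset 0x%05x\n",
                   mem_id, bank + w, offset + h);
        }
    }
    return 0;
}

int main(void)
{
    /* a (height = 2, width = 2) leaf in d1, bank a, offset 0x00010 */
    return expand_shape((0x9u << 22) | (0x0u << 20) | 0x00010, 2, 2);
}
```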
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 301 of 554 8.1.6 illegal memory access when the tse uses an undefined memoryid (that is, reserved) or an illegal memory shape, the tse aborts the current command, returns a ko status, and sets an exception flag at bit 2 in interrupt class 3 (tsm_illegal_memory_access). the exception flag can be set to cause an interrupt by setting bit 2 of inter- rupt mask 3 to ? 1 ? (tsm_illegal_memory_access). for debugging purposes, an exception can switch a programmable set of threads into single-step mode. table 8-5. height, width, and offset restrictions for tse objects memory height width total object size (bits) control store address offset must be divisible by h0 1 2 3 4 1 1 1 1 128 256 384 512 1 2 4 4 h1 1 2 3 4 5 6 7 8 1 1 1 1 1 1 1 1 36 64 96 128 160 192 224 256 1 2 4 4 8 8 8 8 z0 1 2 3 4 5 6 7 8 1 1 1 1 1 1 1 1 36 64 96 128 160 192 224 256 1 2 4 4 8 8 8 8 d0 (d0_width = 1; double wide configuration) 1 2 3 4 1 2 1 1 1 1 1 1 2 2 3 4 128 256 384 512 256 512 384 512 1 2 4 4 1 2 1 1 d0-d1-d2-d3-d6 (d0_width = 0; single wide configuration) 1 2 3 4 5 6 7 8 1 2 3 4 1 2 1 2 1 1 1 1 1 1 1 1 2 2 2 2 3 3 4 4 64 128 192 256 320 384 448 512 128 256 384 512 192 384 256 512 1 2 4 4 8 8 8 8 1 2 4 4 1 2 1 2
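table 8-5 follows a regular pattern for width-1 objects: the control store offset must be divisible by 1 for height 1, by 2 for height 2, by 4 for heights 3 - 4, and by 8 for heights 5 - 8. a small helper like the one below (purely illustrative, not an np4gs3 facility) could be used by table-allocation code to validate candidate base addresses; wider shapes follow the per-memory rows of the table instead.

```c
#include <stdint.h>
#include <stdio.h>

/* required divisor of the control store offset for a width-1 object
 * of the given height, per the pattern in table 8-5. */
static unsigned offset_divisor(unsigned height)
{
    if (height <= 1) return 1;
    if (height == 2) return 2;
    if (height <= 4) return 4;
    return 8;               /* heights 5..8 */
}

static int offset_is_legal(uint32_t offset, unsigned height)
{
    return (offset % offset_divisor(height)) == 0;
}

int main(void)
{
    printf("offset 0x6, height 3: %s\n",
           offset_is_legal(0x6, 3) ? "legal" : "illegal");  /* illegal */
    printf("offset 0x8, height 5: %s\n",
           offset_is_legal(0x8, 5) ? "legal" : "illegal");  /* legal   */
    return 0;
}
```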
8.1.7 memory range checking (address bounds check)

memory range checking can flag accesses to a programmable address range, defined by min_addr_bounds and max_addr_bounds, for reads, writes, or both as selected by addr_bounds_cntl. memory range checking applies only to tse control store accesses. when memory range checking is enabled, any control store read, write, or read/write address falling outside the defined range generates an exception (an exception does not stop tse operation) and sets an exception flag at bit 1 in interrupt class 3 (tsm_addr_range_violation). the exception flag can be set to cause an interrupt by setting bit 1 of interrupt mask 3 to '1' (tsm_addr_range_violation). for debugging purposes, the bounds violation register (bounds_vio) indicates which gxh caused an address bounds exception.
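the address bounds check amounts to a range compare applied per access type. the c fragment below models that behavior for clarity only; the register names are taken from the text, but the decode of addr_bounds_cntl into separate read and write enables is an assumption.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* software model of the tse address bounds check (illustrative only). */
struct addr_bounds {
    uint32_t min_addr;     /* min_addr_bounds                    */
    uint32_t max_addr;     /* max_addr_bounds                    */
    bool     check_reads;  /* assumed decode of addr_bounds_cntl */
    bool     check_writes;
};

/* returns true when the access would raise the
 * tsm_addr_range_violation exception flag (class 3, bit 1). */
static bool bounds_violation(const struct addr_bounds *b,
                             uint32_t cs_addr, bool is_write)
{
    bool enabled = is_write ? b->check_writes : b->check_reads;
    if (!enabled)
        return false;
    return cs_addr < b->min_addr || cs_addr > b->max_addr;
}

int main(void)
{
    struct addr_bounds b = { 0x0010000, 0x001FFFF, true, true };
    printf("read  0x0020000: %s\n",
           bounds_violation(&b, 0x0020000, false) ? "violation" : "ok");
    printf("write 0x0010004: %s\n",
           bounds_violation(&b, 0x0010004, true) ? "violation" : "ok");
    return 0;
}
```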
8.2 trees and tree searches

the tse uses trees to store and retrieve information. tree searches, retrievals, inserts, and deletes are performed according to a key that is similar to a mac source address or a concatenation of an ip source and destination address. information is stored in one or more leaves that contain the key as a reference pattern and, typically, contain aging and user information for forwarding purposes such as target blade and target port numbers. to locate a leaf, a search algorithm processes input parameters that include the key and hashes the key. the algorithm then accesses a direct table (dt) and walks the tree through the pscbs.

there are three types of trees, each with its own search algorithm and tree-walk rules: full match (fm), longest prefix match (lpm), and software managed (smt).

the data structure of fm and lpm trees is the patricia tree. when a leaf is found, the leaf is the only candidate that can match the input key. a 'compare-at-end' operation compares the input key with a reference pattern stored in the leaf to verify a match. search results are 'ok' when a match is found and 'ko' in all other cases.

the data structure of smt trees is similar to that of fm trees, but smt trees can have multiple leaves that can be chained in a linked list. all leaves in the chain are checked against the input key until a match is found or the chain is exhausted. search results are 'ok' when a match is found and 'ko' in all other cases.

table 8-6. fm and lpm tree fixed leaf formats
field name / byte length / description
nlarope / 4 / leaf chaining pointer, aging, and direct leaf information.
prefix_len / 1 / length of the pattern (in bits), for lpm only. not used by the tse for fm trees and can be used by picocode.
pattern / 2-24 / pattern to be compared with the hashedkey. always present. length given by p1p2_max_size from table 8-17: ludeftable entry definitions on page 312. for fm: unused bits between (p1p2_max_size - hashedkey_length) must be initialized to zero. for lpm: unused bits between (p1p2_max_size - prefix_len) do not have to be initialized and can be used for user data, since these bits do not take part in the comparison.
userdata / variable / under picocode control. for example, the field can include one or more counters.

table 8-7. smt tree fixed leaf formats
field name / byte length / description
nlasmt / 4 / leaf chaining pointer to chain leaves for smt. includes shape of chained leaf.
comp_table_index / 1 / defines the index in the compdeftable that defines the compare between pattern1, pattern2, and the hashedkey.
pattern1 and pattern2 / 4-48 / contains pattern1 and pattern2 bitwise interleaved (even bits represent pattern1 and odd bits represent pattern2); that is, bit 0 of the field contains bit 0 of pattern1, bit 1 contains bit 0 of pattern2, etc. length given by 2*p1p2_max_size from table 8-17: ludeftable entry definitions on page 312. for smt: unused bits between (2 * p1p2_max_size - 2 * hashedkey_length) must be initialized to zero.
userdata / variable / under picocode control. for example, the field can include one or more counters.
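as a visustration of the byte-level ordering in table 8-6, the fixed portion of an fm/lpm leaf can be pictured as the c layout below. this is only a sketch: the exact packing in the control store is hardware defined, the pattern length shown (24 bytes, the maximum p1p2_max_size) is just an example, and the struct name is invented.

```c
#include <stdint.h>
#include <stdio.h>

/* illustrative layout of an fm/lpm leaf with the maximum pattern size
 * (p1p2_max_size = 24 bytes = 192 bits); the trailing userdata length
 * is tree-dependent. this is not the bit-exact control store format. */
#define P1P2_MAX_SIZE_BYTES 24

struct fm_lpm_leaf {
    uint8_t nlarope[4];                    /* chaining ptr, aging, direct-leaf info */
    uint8_t prefix_len;                    /* pattern length in bits (lpm only)     */
    uint8_t pattern[P1P2_MAX_SIZE_BYTES];  /* compared against the hashedkey        */
    uint8_t userdata[];                    /* picocode-defined, e.g. counters       */
};

int main(void)
{
    printf("fixed leaf portion: %zu bytes\n", sizeof(struct fm_lpm_leaf));
    return 0;
}
```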
ibm powernp np4gs3 network processor preliminary tree search engine page 304 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.1 input key and color register for fm and lpm trees for fm and lpm trees, the input key is hashed into a hashedkey according to the hash algorithm specified by the hash_type field in the ludeftable. to minimize the depth of the tree that begins after the direct table, the hash function output is always a 192-bit number with a one-to-one correspondence to the original input key. maximum output entropy is contained in the hash function ? s most significant bits. the n highest bits of the hashedkey register are used to calculate an index into the direct table, where n is determined by the defi- nition of the dt (direct table) entry for the tree. when colors are enabled for a tree, the 16-bit color register is inserted after the input key has been hashed. this occurs immediately after the direct table. if the direct table contains 2 n entries, the 16-bit color value is inserted at bit position n. the hash function output and the inserted color value (when enabled) are stored in the hashedkey register. when colors are disabled, the 192-bit hash function is unmodified. colors can be used to share a single direct table among multiple independent trees. for example, color could indicate a vlanid in a mac source address table. the input key would be the mac sa and the color the vlanid (vlanid is 12 bits and 4 bits of the color would be unused, that is, set to 0). after the hash function, the pattern would be 48 + 16, or 64 bits. the color would be part of the pattern to distinguish mac addresses of different vlans. 8.2.2 input key and color register for smt trees for smt trees, the input key is a 192-bit pattern and the color register is ignored. no hashing is performed. table 8-8. search input parameters parameter bit length description key 192 key must be stored in the shared memory pool. all 192 bits must be set cor- rectly (for key length shorter than 192, remaining bits in shared memory pool must be set to 0). key length 8 contains key length minus 1 in bits. ludefindex 8 index into ludeftable that points to an entry containing a full definition of the tree in which the search occurs. see section 8.2.5.1 the ludeftable on page 312. tsedpa 4 tse thread shared memory pool address - stores location of key, keylength, and color and determines leaf destination. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x'8', x'a', x'c', and x'e'. lcbanr 1 search results can be stored in either tsrx, as specified by tsedpa. leaf addresses are stored in lcba0 or lcba1 as specified by lcbanr. during a tse search, picocode can access the other tsr to analyze the results of previ- ous searches. color 16 for trees with color enabled, as specified in the ludeftable, the contents of the color register are inserted into the key during hash operation. see section 8.2.1 input key and color register for fm and lpm trees on page 304 for an explanation of the process.
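the color insertion rule (a 16-bit color spliced into the hash output at bit position n, where the direct table has 2^n entries) can be shown with ordinary bit manipulation. the sketch below numbers bit 0 as the most significant bit of the hash output, matching the "n highest bits index the direct table" description; both the representation and the function names are assumptions.

```c
#include <stdint.h>
#include <stdio.h>

/* bit 0 is the most significant bit of the hashed output. */
static int get_bit(const uint8_t *v, unsigned bit)
{
    return (v[bit / 8] >> (7 - bit % 8)) & 1;
}

static void set_bit(uint8_t *v, unsigned bit, int val)
{
    if (val) v[bit / 8] |=  (1u << (7 - bit % 8));
    else     v[bit / 8] &= ~(1u << (7 - bit % 8));
}

/* insert a 16-bit color into the hashed key at bit position n (the
 * number of direct table index bits); in_bits is the significant
 * length of the hash output, out must hold in_bits + 16 bits. */
static void insert_color(const uint8_t *in, unsigned in_bits,
                         uint16_t color, unsigned n, uint8_t *out)
{
    for (unsigned i = 0; i < n; i++)                 /* dt index bits  */
        set_bit(out, i, get_bit(in, i));
    for (unsigned i = 0; i < 16; i++)                /* color bits     */
        set_bit(out, n + i, (color >> (15 - i)) & 1);
    for (unsigned i = n; i < in_bits; i++)           /* remaining bits */
        set_bit(out, 16 + i, get_bit(in, i));
}

int main(void)
{
    /* 48-bit hashed mac sa, 16-bit color (vlanid in the low 12 bits),
     * direct table with 2^16 entries: result is a 64-bit pattern. */
    uint8_t hash[6] = { 0x1a, 0x2b, 0x3c, 0x4d, 0x5e, 0x6f };
    uint8_t key[8]  = { 0 };
    insert_color(hash, 48, 0x0123, 16, key);
    for (int i = 0; i < 8; i++)
        printf("%02x", key[i]);
    printf("\n");
    return 0;
}
```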
8.2.3 direct table

a search starts when a dtentry is read from the direct table. the read address is calculated from the n highest bits of the hashedkey and from the tree properties defined in the ludeftable. the dtentry can be represented as the root of a tree, with the actual tree data structure depending upon the tree type. a patricia tree data structure is used with fm trees; extensions to the patricia tree are used with lpm and smt trees.

using a dt can reduce search time (pscb access time). increasing the dt size is a trade-off between memory usage and search performance.

when a single leaf is attached to a dtentry, the read data includes a pointer to the leaf. when more than one leaf is attached, the read data defines the root of a tree. when the dtentry is empty, no leaf information is attached.

8.2.3.1 pattern search control blocks (pscb)

a search begins when a dtentry has been read, if the dtentry is neither empty nor contains a direct leaf. a tree walk search starts at the dtentry and passes one or more pscbs until a leaf is found. for an fm tree, the pscb represents a node in the tree, or the starting point of two branches, '0' and '1'. each pscb is associated with a bit position 'p' in the hashedkey. bit p is the next bit to test (nbt) value stored in the previous pscb or in the dtentry. leaves reachable from a pscb through the 0 branch have a '0' in bit p, and leaves reachable through the 1 branch have a '1'. leaves reachable through either branch have patterns in which bits 0..p-1 are identical, because pattern differences begin at bit p. when an fm tree search encounters a pscb, the tse continues the tree walk on the 0 or 1 branch depending on the value of bit p. thus, pscbs are only inserted in the tree at positions where leaf patterns differ. this allows efficient search operations since the number of pscbs, and thus the search performance, depends on the number of leaves in a tree, not on the length of the patterns.

figure 8-2. effects of using a direct table (figure not reproduced in text; it contrasts a data structure without a direct table, where all leaves hang from a single root, with a data structure using a direct table, where the leaves are distributed across multiple dt entries)
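the fm walk described above reduces to: index the dt with the top n bits of the hashed key, then at each pscb test the nbt bit of the hashed key to choose the 0 or 1 branch, until a leaf pointer is reached; a final compare-at-end validates the candidate. this c sketch captures that control flow against simplified in-memory node structures; the structures and helpers are illustrative, not the hardware formats, and the case of a single direct leaf on a dtentry is omitted.

```c
#include <stdint.h>
#include <string.h>

struct leaf { uint8_t pattern[24]; unsigned pattern_bits; /* userdata ... */ };

/* simplified pscb node: one branch per value of the tested bit. */
struct pscb {
    unsigned     nbt;      /* next bit to test in the hashed key */
    struct pscb *next[2];  /* child pscb, or null                */
    struct leaf *leaf[2];  /* leaf pointer when next[b] is null  */
};

static int get_bit(const uint8_t *v, unsigned bit)
{
    return (v[bit / 8] >> (7 - bit % 8)) & 1;
}

/* walk an fm tree: dt[] is indexed by the top dt_bits of the hashed
 * key; returns the matching leaf or null (the "ko" case). */
static struct leaf *fm_search(struct pscb **dt, unsigned dt_bits,
                              const uint8_t *hkey, unsigned hkey_bits)
{
    unsigned idx = 0;
    for (unsigned i = 0; i < dt_bits; i++)
        idx = (idx << 1) | get_bit(hkey, i);

    struct pscb *node = dt[idx];
    struct leaf *cand = NULL;
    while (node) {                               /* tree walk */
        int b = get_bit(hkey, node->nbt);
        cand  = node->leaf[b];
        node  = node->next[b];
    }
    if (!cand)
        return NULL;
    /* compare-at-end: the leaf is only a candidate until its full
     * reference pattern matches the hashed key. */
    if (cand->pattern_bits != hkey_bits ||
        memcmp(cand->pattern, hkey, (hkey_bits + 7) / 8) != 0)
        return NULL;
    return cand;
}

int main(void) { (void)fm_search; return 0; }
```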
ibm powernp np4gs3 network processor preliminary tree search engine page 306 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.3.2 leaves and compare-at-end operation the entire hashedkey is stored in the leaf as a reference pattern, not as the original input key. during a tree walk, only the hashedkey bits for which a pscb exists are tested. when an fm leaf is found, its reference pattern must be compared to the full hashedkey to make sure all the bits match. for smt, if the leaf contains a chain pointer or nla field to another leaf, the new leaf ? s reference pattern is compared to the hashedkey. lacking a match or another nla field, the search ends and the failure is indicated by a ko status. if the pattern matches, the original input key is checked. if that matches, the whole leaf page is returned to the network processor. if there is no match, the leaf page is returned with a no-match message. 8.2.3.3 cascade/cache the direct table can be used as a cache to increase tree search performance since these trees are generally small and contain most likely entries. during a search, the tse first determines whether the dt contains a pointer to a leaf matching the hashedkey. if so, the leaf is returned, eliminating the need for a search. to the tse, a cache lookup is identical to a normal search, that is, the input key is hashed into a hashedkey and the dt is accessed. cascades/caches are enabled in the ludeftable on a per-tree basis. if a cache search uses ludeftable entry i and the search ends with ko, another search using ludeftable entry i+1 starts automatically. this allow multiple search chaining, although the full tree should be stored under ludeftable entry i+1. cascading/caching of more than one ludeftable entry is not supported in the current design, therefore the ludeftable entry i + 1 must have the cache enable bit set to ? 0 ? (i.e., disabled). also, whenever cascade/ cache is enabled, the useludefcopyreg operand option for gth should be set to ? 0 ? for current version of the device. 8.2.3.4 cache flag and nrpscbs registers picocode initiates insert and delete operations to and from the cache. each search result stores information about the cache in the cacheflag and nrpscbs registers as shown in table 8-9 . each register is divided into two sections, one for searches using tse 0 coprocessor and the other for searches using tse 1 copro- cessor. there are two copies of these registers for each tse 0 and tse 1 resource. table 8-9. cache status registers register bit length description cacheflag(0) 1 cacheempty bit. set when cache search finds an empty dtentry in cache dt. cacheflag(1) 1 cacheleaffound bit. set when cache search finds a leaf and cache search returns ok. when leaf found bit has been set, full search has not been per- formed. cacheflag(2) 1 cacheko bit. set when cache search returns ko. when cache is empty, this bit is also set. nrpscbs 8 after any search, contains the number of pscbs read during a tree walk. when a cache search finds no leaf and a full search starts, contains the number of pscbs read during the search.
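the cascade/cache behavior is essentially "try ludeftable entry i; on ko, retry with entry i + 1". the fragment below illustrates that chaining and one way picocode might use the nrpscbs result to decide on a cache insert; every identifier here is a stand-in for illustration, not the coprocessor command interface.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* stand-in result of a single tree search against one ludef entry. */
struct ts_result {
    bool     ok;            /* ok/ko flag                      */
    bool     cache_empty;   /* cacheflag: empty dtentry found  */
    bool     cache_leaf;    /* cacheflag: leaf found in cache  */
    uint8_t  nr_pscbs;      /* pscbs read during the tree walk */
    uint32_t leaf_addr;     /* lcba0/1                         */
};

/* stub standing in for a ts_fm coprocessor command against the given
 * ludef index; here it always returns ko. */
static struct ts_result tree_search(unsigned ludef_index, const uint8_t *key)
{
    struct ts_result r = { false, true, false, 0, 0 };
    (void)ludef_index; (void)key;
    return r;
}

/* search with a cache entry at ludef index i and the full tree at
 * i + 1; a cache insert (a normal fm insert into the tree at index i)
 * may be worthwhile when the full search walked many pscbs. */
static struct ts_result cached_search(unsigned i, const uint8_t *key,
                                      unsigned insert_threshold)
{
    struct ts_result r = tree_search(i, key);        /* cache lookup */
    if (r.ok)
        return r;                                    /* cache hit    */

    r = tree_search(i + 1, key);                     /* full tree    */
    if (r.ok && r.nr_pscbs >= insert_threshold) {
        /* cache_insert(i, key, r.leaf_addr);  -- hypothetical */
    }
    return r;
}

int main(void)
{
    uint8_t key[24] = { 0 };
    struct ts_result r = cached_search(0, key, 3);
    printf("search %s\n", r.ok ? "ok" : "ko");
    return 0;
}
```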
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 307 of 554 8.2.3.5 cache management cache management is performed using picocode. cache inserts are controlled by inspecting the cacheflag and nrpscbs registers after each tree search. inserts are treated like normal fm tree inserts, allowing the association of multiple leaves with a single dtentry, because normal fm inserts create pscbs to handle multiple leaves. inserts can also be done by writing directly to a dtentry, although only using single leaves. cache deletes use the tree aging mechanism whereby every n seconds all entries in the cache are deleted. 8.2.3.6 search output the output of a search operation consists of the parameters listed in table 8-10 . 8.2.4 tree search algorithms the tse provides hardware search operations for fm, lpm, and smt trees. software initializes and main- tains trees. leaves can be inserted into and removed from fm and lpm trees without control point function (cpf) intervention, permitting scalable configurations with cpf control when needed. 8.2.4.1 fm trees full match (fm) trees provide a mechanism to search tables with fixed size patterns, such as a layer 2 ethernet unicast mac tables which use fixed six-byte address patterns. searches of fm trees are efficient because fm trees benefit from hash functions. the tse offers multiple fixed hash functions that provide very low collision rates. each dtentry is 36 bits wide and contains the formats listed in table 8-11 .pscbshavethesamestructure as dtentrys except they contain two pscblines, each of which can have one of the two pointer formats listed in this table : next pointer address (npa) or leaf control block address (lcba) . the two pscblines are allocated consecutively in memory and are used as walking branches in the tree. the nbt value signifies which of the two pscblines is used. table 8-10. search output parameters parameter description ok/ko flag 0 ko: unsuccessful operation 1 ok: successful operation that is, leaf pattern matches hashedkey tsrx shared memory pool leaf contents are stored at tsedpa address of the shared memory pool, specified by the tree search command operand. lcba0 / 1 scalar registers leaf address is stored in lcba0 / 1 based on lcbanr input value. cacheflags(2) cacheempty bit cacheflags(1) cacheleaffound bit cacheflags(0) cacheko bit nrpscbs number of pscbs read during last search. can be used as a criterion to insert fm cache entry.
ibm powernp np4gs3 network processor preliminary tree search engine page 308 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.4.2 lpm trees longest prefix match (lpm) trees provide a mechanism to search tables with variable length patterns or prefixes, such as a layer 3 ip forwarding table where ip addresses can be full match host addresses or prefixes for network addresses. the cpf manages lpm trees with assistance from the gth for inserting and removing leaf entries. lpm dtentrys, each of which can contain a node address, an npa, and an lcba, differ from fm dtentrys which cannot contain both a node and leaf address. each dtentry is 64 bits wide and contains the formats listed in table 8-12: lpm dtentry and pscbline formats on page 308. pscbs have the same structure as dtentrys except pscbs contain two pscblines, each of which can have one of the three lcba formats listed in the table. the two pscblines are allocated consecutively in memory and are used as walking branches in the tree. one of the pscb lines may be empty, whichisnotallowedinfmpscbs. 8.2.4.3 smt trees software-managed (smt) trees provide a mechanism to create trees that follow a control point function (cpf)-defined search algorithm such as an ip quintuple filtering table containing ipsa, ipda, source port, destination port, and protocol. smt trees use the same pscbs as fm trees, but only the first leaf following a pscb is shaped by the table 8-17: ludeftable entry definitions on page 312. the following leaves in a leaf chain are shaped according to the five bits in the chaining pointer contained in the nlasmt leaf field (see table 8-13 ). unlike fm and lpm, smt trees allow leaves to specify ranges, for instance, that a source port must be in the range of 100..110. smt trees always contain two patterns of the same length in a leaf to define a comparison range. when the first leaf is found after a pscb, a compare-at-end operation is performed. if ok, the search stops. if the comparison returns ko and the nlasmt field is non-zero, the next leaf is read and another compare-at-end operation is performed. this process continues until an ? ok ? is returned or until the nlasmt field is zero, which returns a ko. table 8-11. dtentry and pscbline formats format conditions valid in dtentry? valid in pscb? format (2 bits) npa/lcba (26 bits) nbt (8 bits) empty dtentry no leaves yes no 00 0 0 pointer to next pscb dtentry contains pointer yes yes 00 npa nbt pointer to leaf single leaf associated with dtentry; lcba field contains pointer yes yes 01 lcba 0 table 8-12. lpm dtentry and pscbline formats format conditions valid in dtentry? valid in pscb? format (2 bits) npa (26 bits) nbt (8 bits) lcba (26 bits) spare (2 bits) empty dtentry no leaves yes yes 00 0 0 0 0 lcba not valid dtentry contains pointer to pscb yes yes 00 npa nbt 0 0 lcba valid; npa/ nbt not valid single leaf associated with dtentry; lcba contains pointer to leaf; no pointer to next pscb yes yes 01 0 0 lcba 0 lcba valid; npa/ nbt valid single leaf associated with dtentry; lcba contains pointer to leaf; pointer to next pscb yes yes 01 npa nbt lcba 0
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 309 of 554 8.2.4.4 compare-at-end operation the input key and the two reference patterns stored in each leaf can be logically divided into multiple fields (see figure 8-3 on page 309). one of two comparisons can be performed on each field: 1. compare under mask. the input key bits are compared to pattern0 bits under a mask specified in pattern1. a ? 1 ? in the mask means the corresponding bit in the input key must equal the corresponding bit in pattern0. a ? 0 ? means the corresponding bit in the input key has no influence on the comparison. the entire field matches only when all bits match, in which case the tse returns ok. 2. compare under range. the input key bits are treated as an integer and checked to determine whether the integer is within the range specified by min and max, inclusive. if so, the tse returns ok, otherwise the tse returns ko. when all fields return ok, the entire compare-at-end returns ok. otherwise, ko returns. this operation uses comptabledef for definition and type of compare indication. logical field definitions are specified in the compare definition table compdeftable. table 8-14 shows the entry format for the compdeftable. fields are compare-under-mask unless specified by the compdeftable. each entry specifies one or two range comparisons, although multiple entries can specify more than two range comparisons. each range comparison is defined by offset and length parameters. the offset, which is the position of the first bit in the field, must be at a 16-bit boundary and have a value of 0, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 174, or 192. the length of the field must be 8, 16, 24, or 32 bits. table 8-13. nlasmt field format 1 bit 2 bits 3 bits 26 bits reserved width of next leaf height of next leaf nla (next leaf address) figure 8-3. example input key and leaf pattern fields ipsa (32 bits) ipda (32 bits) srcport (16 bits) dstport (16 bits) prot (8 bits) input key value value min min value leaf pattern0 mask mask max max mask leaf pattern1 minmax field1 offset field1 offset field0 length minmax field0 length
ibm powernp np4gs3 network processor preliminary tree search engine page 310 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 in the input key shown in figure 8-3: example input key and leaf pattern fields on page 309, the compare- under-range for the source port (srcport) field would have offset0 set to 64 and minmaxlength0 set to 16. the compare-under-range for the destination (dstport) field would have offset1 set to 80 and minmaxlength1 set to 16. if more than two range comparisons are required, the continue bit would be set to 1 so the next compdeftable entry could be used for additional compare-under-range definitions. the compdeftable index used for tree comparisons is specified in the leaf comp_table_index field (see table 8-7: smt tree fixed leaf formats on page 303). each compare-under-range operation takes one clock cycle, so use as few compare-under-range operations as possible. if a range is a power of two, or 128-255, no compare-under-range is required since this range can be compared using compare-under-mask. when a compare-at-end fails and the smtnla leaf field is not 0, the tse reads the next leaf and performs additional compare-at-end operations until the compare returns ok or until the smtnla is 0. 8.2.4.5 ropes as shown in the following figure, leaves in a tree can be linked together in a circular list called a rope. the first field in a leaf is the chaining pointer, or the nlarope. picocode can ? walk the rope, ? or sequentially inspect all leaves in a rope. ropes can be created by setting the nlarope_en bit to ? 1 ? in the ludeftable. see table 8- 15: ludeftable rope parameters on page 311. table 8-14. compdeftable entry format field range bit length value range 1 offset (r1_offset) 1 8 starting bit position for first range compare. it is defined by compare_table(35 to 28) bits but only 4 upper bits are used to define the offset i.e (bits 35 to 32) and bits (31 to 28) are ignored. therefore, this field must be in 16 bit boundary range 1 length (r1_len) 1 8 length and number of bits to be used as a part of range compare starting/ beginning from r1_offset.this field is specified as the number of bits multi- plied by 4. it is defined by compare_table(27 to 20) bits, but only 3 msbs are used to define the length (i.e. bits 27,26 & 25) and bits 24 to 20 are ignored. range 2 offset (r2_offset) 2 8 starting bit position for second range compare. it is defined by compare_table(19 to 12) bits but only 4 upper bits are used to define the off- set; i.e bits 19 to 16 and bits 15 to 12 are ignored. therefore, this field must be in 16-bit bound. range 2 length (r2_len) 2 8 length and number of bits to be used as a part of range compare starting/ beginning from r2_offset. this field is specified as the number of bits multi- plied by 4. it is defined by compare_table(11 to 4) bits but only 3 msbs are used to define the length (i.e. bits 11 to 9) and bits (8 to 4) are ignored. range 2 valid (r2_valid) -- 1 range 2 valid value. this field indicates whether or not the range 2 offset and length fields are valid. 0 range 1 offset and length valid, but range 2 offset and length not valid. 1 range 1 and range 2 offset and length all valid. it is defined by compare_table(3) bit. continue (cont) -- 1 continue indicator value. this field indicates whether the compare opera- tion is continued in the next sequential entry of the table. 0 comparison not continued 1 comparison continued it is defined by compare_table(2) bit.
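the two compare-at-end primitives for smt trees are straightforward to express in c. in the sketch below pattern0/pattern1 carry either value/mask or min/max per field, as in figure 8-3; the field handling is a simplification of a compdeftable entry, and all names are illustrative.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* compare under mask: every key bit for which the mask (pattern1)
 * holds a '1' must equal the corresponding bit of pattern0. */
static bool cmp_under_mask(uint32_t key, uint32_t value, uint32_t mask)
{
    return ((key ^ value) & mask) == 0;
}

/* compare under range: the key field, taken as an integer, must lie
 * within [min, max] inclusive (min from pattern0, max from pattern1). */
static bool cmp_under_range(uint32_t key, uint32_t min, uint32_t max)
{
    return key >= min && key <= max;
}

int main(void)
{
    /* e.g. a 16-bit source port field checked against the range
     * 100..110, and an 8-bit protocol field masked to an exact value. */
    uint16_t src_port = 105;
    uint8_t  proto    = 6;

    bool ok = cmp_under_range(src_port, 100, 110) &&
              cmp_under_mask(proto, 6, 0xFF);
    printf("compare-at-end: %s\n", ok ? "ok" : "ko");
    return 0;
}
```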
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 311 of 554 leaf insertion is always done between a ropecla and ropepla. after insertion, the ropepla is unchanged and the ropecla points to the newly inserted leaf. leaf deletion is done two ways: 1. when the rope is not being walked, the rope is a single-chained linked list without a previous pointer. leaves cannot be deleted without breaking the linked list. setting the deletepending bit postpones the deletion until the rope is walked again. a leaf is deleted by setting the deletepending bit of the nlarope field. in this case, leaves are completely deleted and leaf pattern searches will return a ko. 2. when the rope is being walked and the deletepending bit is set, the tse deletes the leaf automatically. 8.2.4.6 aging aging is enabled whenever a rope is created. the nlarope field contains one aging bit that is set to ? 1 ? when a leaf is created. when a leaf that matches the leaf pattern is found during a tree search, the tse sets the aging bit to ? 1 ? if it has not previously done so, and then writes this information to the control store. an aging function controlled by a timer or other device can walk the rope to delete leaves with aging bits set to ? 0 ? and then write the leaves to the control store with aging bits set to ? 0 ? . when no aging is desired, picocode should not alter the aging bit, since bits set to ? 1 ? cannot be changed. figure 8-4. rope structure table 8-15. ludeftable rope parameters parameter description ropecla current leaf address in the rope. all tse instructions related to the rope such as rclr, ardl and tlir (see section 8.2.8 gth hardware assist instructions on page 336) relate to the leaf addressed by ropecla. ropepla previous leaf address in the rope. always the previous leaf if the rope is related to ropecla unless the rope contains no leaf or one leaf. the following condition is always true for trees with two or more leaves: ropepla -> nlarope == ropecla. leafcnt leaf count - number of leaves in the tree leafth leaf threshold causes an exception to be generated when the leafcnt exceeds the threshold. leaf1 ropecla th leaf2 leaf3 leaf4 leaf0 ropepla leafcnt part of ludeftable
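an aging function built on a rope does little more than walk the leaf list, deleting leaves whose aging bit was never turned back on (or whose deletepending bit is set) and clearing the aging bit on the rest. the linked-list model below is only a software illustration of that policy; on the np4gs3 the walk is performed with the gth rope instructions and the nlarope field, not with c pointers, and the circular closure of the rope is omitted here for brevity.

```c
#include <stdbool.h>
#include <stdlib.h>

/* software stand-in for a leaf on a rope: the nlarope field provides
 * the next-leaf pointer, the aging bit, and the delete-pending bit. */
struct rope_leaf {
    struct rope_leaf *next;
    bool aging;            /* set by the tse when the leaf is hit */
    bool delete_pending;
};

/* one aging pass over 'count' leaves starting at the current leaf:
 * delete unreferenced or delete-pending leaves, and clear the aging
 * bit on the survivors so the next pass can age them. */
static struct rope_leaf *aging_pass(struct rope_leaf *curr, unsigned count)
{
    struct rope_leaf *prev = NULL, *head = curr;
    while (count--) {
        struct rope_leaf *next = curr->next;
        if (!curr->aging || curr->delete_pending) {
            if (prev)
                prev->next = next;       /* unlink the dead leaf */
            if (curr == head)
                head = next;
            free(curr);
        } else {
            curr->aging = false;         /* age it for the next pass */
            prev = curr;
        }
        curr = next;
    }
    return head;
}

int main(void) { (void)aging_pass; return 0; }
```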
ibm powernp np4gs3 network processor preliminary tree search engine page 312 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.5 tree configuration and initialization 8.2.5.1 the ludeftable the lookup definition table (ludeftable), an internal memory structure that contains 128 entries to define 128 trees, is the main structure that manages the control store. the table indicates in which memory (ddr- sdram, sram, or internal ram) trees exist, whether caching is enabled, key and leaf sizes, and the search type to be performed. each dtentry contains two indexes to the pattern search and leaf free queue (pscb_leaf_fq). the first index defines which memory to use to create tree nodes, that is, where pscbs are located. the second defines which memory to use to create leaves within a tree. when an entry is added to a tree, the memory required for the insert comes from the pscb_leaf_fq and is returned when the entry is deleted. for smt trees, an lu_def_tbl can be defined to match a value into a given range, but an index to an internal compare type index must be given for the compare table (cmp_tbl). table 8-16. nlarope field format 2 bits 1 bit 2 bits 1 bit 26 bits reserved aging counter reserved delete pending nla (next leaf address) table 8-17. ludeftable entry definitions (page 1 of 3) field bit length description cascade_entry/ cache_entry 1 0 normal tree entry (not a cache entry) 1 a cascade/cache entry. when a search returns with ko, and cache_entry = 1, ts instructions will restart using the next entry in the ludeftable (that is, ludefindex + 1). special note: ludefindex +1 entry must have this bit set to ? 0 ? (i.e., disabled). search_type 2 search type value. this field indicates the type of search to be performed. 00 full match (fm). 01 longest prefix match (lpm). 10 software managed tree longest prefix match (smt lpm). 11 reserved. hash_type 4 hash type 0000 no hash 0001 192-bit ip hash 0010 192-bit mac hash 0011 192-bit network dispatch hash 0100 no hash 0101 48-bit mac swap 0110 60-bit mac swap 0111 reserved 1000 8-bit hash 1001 12-bit hash 1010 16-bit hash special note: when using hash_type 1000, 1001, and 1010, hashedkey and hashedkey_length will be extended by 8, 12, or 16 bits, respectively. color_en 1 color enable control value. this field controls whether the color value is used to form the hashed key value. 0 color register not used. 1 color register is used
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 313 of 554 p1p2_max_size 5 maximum size of pattern1 and pattern2 in leaf - indicates size in half words (that is, 16 bits) reserved for the ? pattern ? as a part of leaf in bytes. the maximum pattern size is 12, which represents 192 bits. p1p2_max_size mapping: 00000 0 byte (no pattern) 00001 2 byte 00010 4 bytes 00011 6 bytes 00100 8 bytes 00101 10 bytes 00110 12 bytes 00111 14 bytes 01000 16 bytes 01001 18 bytes 01010 20 bytes 01011 22 bytes 01100 24 bytes others 0 byte (reserved) note: for fm: unused bits between p1p2_max_size - hkey_length must be initilized to zero. for lpm: unused bits between (p1p2_max_size - prefix_length) do not have to be initialized and can be used by userdata since these bits do not take part in the com- parsion. for smt: unused bit between (2 * p1p2_max_size - 2 * hashedkey_length) must be initialized to zero. nlarope_en 1 nla rope enable control value. this field indicates whether or not the leaf contains a next leaf address rope (nla rope) field. 0 leaf does not contain nlarope field 1 leaf contains nlarope field (enables aging) pscb_fq_index 6 pattern search control block free queue index value. this is the index of the pscb free list. pscb_height 1 height (part of the shape) of a pscbentry. width of a pscb is always 1. 0 height = 1 (fm/smt zbt sram, fm/smt h1 sram, fm/lpm/smt ddr sdram) 1 height = 2 (lpm zbt sram, lpm h1 sram) in np4gs3b (r2.0), this field is unused and does not affect operation. dt_base_addr 26 direct table base address value. table 8-17. ludeftable entry definitions (page 2 of 3) field bit length description
ibm powernp np4gs3 network processor preliminary tree search engine page 314 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.5.2 tse free lists (tse_fl) the gth has access to a set of 64 tse free list control blocks, each defining a free list using the format shown in table 8-18 . free lists typically chain unused pscbs and leaves into a linked list, but can also be used by picocode to manage memory. the link pointer is stored in the control store at the address of the dt_size 4 direct table size value - defines the number of dt entries within a memory bank. 0001 4 entries, (2 bits) 0010 16 entries, (4 bits) 0011 64 entries, (6 bits) 0100 256 entries, (8 bits) 0101 1 k entries, (10 bits) 0110 4 k entries, (12 bits) 0111 16 k entries, (14 bits) 1000 64k entries (16 bits), np4gs3a (r1.1); 32k entries (16 bits), np4gs3b (r2.0) 1001 256k entries (18 bits), np4gs3a (r1.1); 64k entries (14 bits), np4gs3b (r2.0) 1010 1m entries (20 bits), np4gs3a (r1.1); 128k entries (17 bits), np4gs3b (r2.0) 1011 256k entries, (18 bits) np4gs3b (r2.0) 1100 512 k entries, (19 bits) np4gs3b (r2.0) 1101 1 m, (20 bits) np4gs3b (r2.0) others reserved leaf_fq_index 6 defines index of leaf free list leaf_width 2 leaf width leaf_height 3 leaf height ludef_state 2 ludef state - used by product software/user code to indicate current status of the entry. these bits do not affect tree search operation. ludef_invalid 1 indicates whether or not this entry is valid for normal tree search command or not. this bit blocks any tree search while the tree is being built or swapped. 0 valid 1 invalid when the ludef read request initiated by the tree search command finds this bit set to ? 1 ? , it will re-read this ludefentry every 16 cycles until it is set to valid. this halts the tree search for this particular thread. following fields are valid for gth only ropecla 26 current leaf address for rope ropepla 26 previous leaf address for rope leafcnt 26 number of leaves in tree leafth 10 threshold for the number of leaves in a tree. if the leafcnt is greater than or equal to this assigned threshold, then a class 0 interrupt is generated (bit 9 of the class 0 interrupt vector). for np4gs3a (r1.1), whenever the ludef table entry is read, checking of threshold is performed. this checking is not restricted to rope com- mands. therefore, this checking happens for all the commands (e.g., tree search). for np4gs3b (r2.0), checking of the threshold is performed only for the tlir rope command. the threshold is formed by the generation of two 26-bit numbers that are bit-wise ored resulting in a threshold value that is compared to the leafcnt: threshold(25:0) = 2**(leafth(9:5) )or 2**(leafth(4:0)) table 8-17. ludeftable entry definitions (page 3 of 3) field bit length description
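the leaf threshold encoding at the end of table 8-17 packs two power-of-two exponents into the 10-bit leafth field; the effective threshold is the bitwise or of the two powers. a few lines of c make the arithmetic concrete (the helper name is invented).

```c
#include <stdint.h>
#include <stdio.h>

/* threshold(25:0) = 2**leafth(9:5)  or  2**leafth(4:0) */
static uint32_t leaf_threshold(uint16_t leafth)
{
    uint32_t hi = 1u << ((leafth >> 5) & 0x1F);
    uint32_t lo = 1u << (leafth & 0x1F);
    return (hi | lo) & 0x3FFFFFFu;   /* 26-bit result */
}

int main(void)
{
    /* leafth = (12 << 5) | 10 gives a threshold of 2^12 | 2^10 = 5120 */
    printf("threshold = %u\n", leaf_threshold((12u << 5) | 10u));
    return 0;
}
```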
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 315 of 554 previous list object. objects stored in different memories and with different shapes should be placed in different free lists. for example, a list of 64-bit pscbs stored in both zbt sram and internal ram should have different entries. the gth thread executes the tsenqfl and tsdqfl commands to enqueue and dequeue addresses on a free list. see section 8.2.8.3 tree search enqueue free list (tsenqfl) on page 338 and section 8.2.8.4 tree search dequeue free list (tsdqfl) on page 338. a free list of n objects, each with shape of width = w, height = h and a start address of ? a ? , is created by enqueueing address a, a+h, a+2h, a+3h, ? a+(n-1)h in the free list. to prevent memory bottlenecks, free lists can also be created in a ? sprayed ? manner for objects contained in sdram. for example, when searching a large lpm tree with five pscbs in a single bank, the bank must be accessed five times before reaching a leaf. when the pscbs are sprayed among multiple banks, the number of accesses remains iden- tical but accesses are to multiple banks, thus eliminating bottlenecks. 8.2.6 tse registers and register map table 8-18. free list entry definition field bit length description head 26 free list head address in the control store. tail 26 free list tail address in the control store. qcnt 26 number of entries in the free list. threshold 5 threshold value for the free list control block entry. this field is initialized to 0. the threshold is determined as 2**threshold. when the qcnt is less than or equal to the threshold, a class 0 interrupt (bit 8) is generated. table 8-19. tse scalar registers for gth only (page 1 of 2) name read/ write hex address bit length description color r/w 00 16 color - see section 8.2.1 input key and color register for fm and lpm trees on page 304 and section 8.2.2 input key and color register for smt trees on page 304. lcba0 r/w 02 26 leaf control block address 0 - typically contains the control store address of the leaf in tsr0, but is also used as an address register for various tse commands. lcba1 r/w 03 26 leaf control block address 1 - typically contains the control store address of the leaf in tsr1, but is also used as an address register for various tse commands. dta_addr r/w 04 26 dtentry address - valid after a hash has been performed. dta_shape r/w 05 5 shape of a dtentry - always (1,1) when direct leaves are disabled. equals the leaf shape as defined in ludeftable when direct leaves are enabled. always set to ? 00000 ? . hashedkeylen r/w 06 8 pattern length minus 1 in hashedkey cacheflags r 07 3 see section 8.2.3.3 cascade/cache on page 306 nrpscbs r 08 8 see section 8.2.3.3 cascade/cache on page 306 hashedkey r/w 0a-0f 192 contains hashedkey hashedkey 191_160 r/w 0a 32 bits 191..160 of hashedkey hashedkey 159_128 r/w 0b 32 bits 159..128 of hashedkey hashedkey 127_96 r/w 0c 32 bits 127..96 of hashedkey hashedkey 95_64 r/w 0d 32 bits 95..64 of hashedkey
ibm powernp np4gs3 network processor preliminary tree search engine page 316 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 hashedkey 63_32 r/w 0e 32 bits 63..32 of hashedkey hashedkey 31_0 r/w 0f 32 bits 31..0 of hashedkey ludefcopy r 10-12 96 contains lu_def_table index in use and a copy of the following lu_def_table fields: field bits definition reserved 95:88 x'00'reserved lu_def_index 87:80 index of the location from where the con- tent was read. for remaining field definitions, see table 8-17: ludeftable entry definitions on page 312, with exceptions noted. lu_def_invalid 79 lu_def_state 78:77 reserved 76:75 reserved leaf_height 74:72 leaf_width 71:70 leaf_fq_index 69:64 reserved 63:62 reserved dt_size 61:58 dt_base_addr 57:32 reserved 31:23 pscb_height 22 pscb_fq_index 21:16 reserved 15:14 reserved nla_rope_ena 13 p1p2_max_size 12:8 color_ena 7 hash_type 6:3 search_type 2:1 cascade/cache entry 0 ludefcopy 95_64 r 10 32 bits 95:64 of the ludefcopy register ludefcopy 63_32 r 11 32 bits 63..32 of ludefcopy ludefcopy 31_0 r 12 32 bits 31..0 of ludefcopy setpatbit_gdh r/w 1b 1 contains one bit from the pattern stored in tsrx. set by setpatbit_gdh com- mand. distposreg_gdh r 1a 8 contains result of distpos command from distpos_gdh ludefcopy_gdh r 19 26 contains ludeftable index in use and a copy of the following ludeftable fields: reserved = ludefcopy_gdh(25,24,23), leaf_format_type = ludefcopy_gdh(22,21), p1p2_max_size = ludefcopy_gdh(20 to 16), dt_size= ludefcopy_gdh(15 to 12), nlarope_en = ludefcopy_gdh(11), color_en = ludefcopy_gdh(10), tree_type = ludefcopy_gdh(9,8), ludeftable_index= ludefcopy_gdh(7 to 0) ludefcopy_gdh 31_0 r 19 26 bit 31..0 of ludefcopy_gdh table 8-19. tse scalar registers for gth only (page 2 of 2) name read/ write hex address bit length description
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 317 of 554 table 8-20. tse array registers for all gxh name read/ write starting hex address ending hex address bit length description tsr0 r/w 32 47 2048 tree search result area 0 note: portion of the data overlaps with tsr1 tsr1 r/w 40 47 1024 tree search result area 1 note: portion of the data overlaps with tsr0 tsr2 r/w 48 63 2048 tree search result area 2 note: portion of the data overlaps with tsr3 tsr3 r/w 56 63 1024 tree search result area 3 note: portion of the data overlaps with tsr2 note: in this table, starting and ending address represents the offset for a given thread's starting address in the shared memory pool. table 8-21. tse registers for gth (tree management) name read/ write hex address bit length description patbit_tsr0 r/w 1e 1 contains one bit from the pattern stored in tsr0. set by trs0pat command. distposreg r 1f 8 contains result of distpos command luropecopyth r 13 10 contains copy of leafth field of ludeftable luropecopyqcnt r 14 26 contains copy of leafcnt field of ludeftable luropecopyprev r 15 26 contains copy of ropepla field of ludeftable luropecopycurr r 16 26 contains copy of ropecla field of ludeftable table 8-22. tse scalar registers for gdh and gth name read/ write hex address bit length description lcba0 r/w 02 31 leaf control block address 0 - typically contains the control store address of the leaf in tsrx, but is also used as an address register for various tse commands. bits 30:26 the leaf control block address shape which is used by cmpend instruction only bits 25:0 leaf control block address lcba1 r/w 03 31 leaf control block address 1 - typically contains the control store address of the leaf in tsrx, but is also used as an address register for various tse commands. bits 30:26 the leaf control block address shape which is used by cmpend instruction only bits 25:0 leaf control block address cacheflags r 07 3 see section 8.2.3.3 cascade/cache on page 306 nrpscbs r 08 8 see section 8.2.3.3 cascade/cache on page 306
ibm powernp np4gs3 network processor preliminary tree search engine page 318 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 setpatbit_gdh r/w 1b 1 contains one bit from the pattern stored in tsrx. set by setpatbit_gdh com- mand distposreg_gdh r 1a 8 contains result of distpos command from distpos_gdh ludefcopy_gdh r 19 26 contains ludeftable index in use and a copy of the following ludeftable fields: reserved = ludefcopy_gdh(25,24,23), leaf_format_type = ludefcopy_gdh(22,21), p1p2_max_size = ludefcopy_gdh(20 to 16), dt_size= ludefcopy_gdh(15 to 12), nlarope_en = ludefcopy_gdh(11), color_en = ludefcopy_gdh(10), tree_type = ludefcopy_gdh(9,8), ludeftable_index= ludefcopy_gdh(7 to 0) table 8-23. pscb register format field bit length control store address for pscb npa0 26 next pscb address - pointer to next pscb in tree for pscb part 0 nbt0 8 next bit to test for pscb part 0 lcba0 26 leaf control block address: pointer to leaf for pscb part 0 npa1 26 next pscb address - pointer to next pscb in tree for pscb part 1 nbt1 8 next bit to test for pscb part 1 lcba1 26 leaf control block address - pointer to leaf for pscb part 1 index 8 index of current pscb physically stored in previous pscb patbit 1 value of hashedkey[index] based on value of index field in pscb register table 8-24. tse gth indirect registers indirect register bit length description notes pscbx.npa_hk 26 selects either the npa0 field from pscbx or the npa1 field depending on value of register pscbx.index 1, 2, 3 pscbx.npa_tsr0 26 selects either the npa0 field from pscbx or the npa1 field depending on value of register patbit_tsr0. (register patbit_tsr0 must have been initialized previously using tsr0pat command) 1, 2, 4 pscbx.nbt_hk 8 selects either the nbt0 field from pscbx or the nbt1 field depending on value of register pscbx.index 1, 2, 3 pscbx.nbt_tsr0 8 selects either the nbt0 field from pscbx or the nbt1 field depending on value of register patbit_tsr0 1, 2, 4 pscbx.lcba_hk 26 selects either the lcba0 field from pscbx or the lcba1 field depending on value of register pscbx.index 1, 2, 3 pscbx.lcba_tsr0 26 selects either the lcba0 field from pscbx or the lcba1 field, depending on value of register patbit_tsr0 1, 2, 4 pscbx.notnpa_hk 26 selects either the npa0 field from pscbx or the npa1 field depending on inverse value of register pscbx.index 1, 2, 3 pscbx.notnpa_tsr 0 26 selects either the npa0 field from pscbx or the npa1 field depending on inverse value of register patbit_tsr0 1, 2, 4 table 8-22. tse scalar registers for gdh and gth name read/ write hex address bit length description
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 319 of 554 pscbx.notnbt_hk 8 selects either the nbt0 field from pscbx or the nbt1 field depending on inverse value of register pscbx.index 1, 2, 3 pscbx.notnbt_tsr 0 8 selects either the nbt0 field from pscbx or the nbt1 field depending on inverse value of register patbit_tsr0 1, 2, 4 pscbx.notlcba_hk 26 selects either the lcba0 field from pscbx or the lcba1 field depending on inverse value of register pscbx.index 1, 2, 3 pscbx.notlcba_ts r0 26 selects either the lcba0 field from pscbx or the lcba1 field depending on inverse value of register patbit_tsr0 1, 2, 4 1. x must equal 0, 1, or 2. 2. the indirect registers of the tse select, via dedicated hardware assist, one of the tse registers listed in section 8.2.6 tse regis- ters and register map on page 315. the indirect registers appear in the tse register map with a unique register number. 3. pscbx.index points to a specific bit in the hashedkey. the bit ? s value determines whether the 0 or 1 part of pscbx will be read or written. 4. value of patbit_tsr0 determines whether the 0 or 1 part of pscbx will be read or written. table 8-25. address map for pscb0-2 registers in gth pscbx read/write pscb0 pscb1 pscb2 size npa0 r/w 80 a0 c0 26 nbt0 r/w 81 a1 c1 8 lcba0 r/w 82 a2 c2 26 npa1 r/w 84 a4 c4 26 nbt1 r/w 85 a5 c5 8 lcba1 r/w 86 a6 c6 26 addr r/w 88 a8 c8 26 index r/w 89 a9 c9 8 patbit r 8b ab cb 1 npa_hk r/w 90 b0 d0 26 nbt_hk r/w 91 b1 d1 8 lcba_hk r/w 92 b2 d2 26 notnpa_hk r/w 94 b4 d4 26 notnbt_hk r/w 95 b5 d5 8 notlcba_hk r/w 96 b6 d6 26 npa_tsr0 r/w 98 b8 d8 26 nbt_tsr0 r/w 99 b9 d9 8 lcba_tsr0 r/w 9a ba da 26 notnpa_tsr0 r/w 9c bc dc 26 notnbt_tsr0 r/w 9d bd dd 8 notlcba_tsr0 r/w 9e be de 26 table 8-24. tse gth indirect registers (continued) indirect register bit length description notes
ibm powernp np4gs3 network processor preliminary tree search engine page 320 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.7 tse instructions the tree search engine (tse) provides facilities for conducting and managing tree searches. two tse coprocessor interfaces are provided for each thread: tse0 and tse1. therefore, each thread can issue instructions to each of these tse0 and tse1 resources. since two types of instruction set groupings differentiate gdh and gth commands, the assembler uses tse0_ to indicate the gdh commands and tse1_ prefixes to indicate the gth commands. on the gth thread only one gth command can be executed at a time, and a second tse (gdh or gth) command is not permitted. table 8-26. general tse instructions opcode command detail section 0null 1ts_fm 8.2.7.1 fm tree search (ts_fm) on page 321 2 ts_lpm 8.2.7.2 lpm tree search (ts_lpm) on page 322 3ts_smt 8.2.7.3 smt tree search (ts_smt) on page 324 4mrd 8.2.7.4 memory read (mrd) on page 325 5mwr 8.2.7.5 memory write (mwr) on page 326 6hk 8.2.7.6 hash key (hk) on page 327 7 rdludef 8.2.7.7 read ludeftable (rdludef) on page 328 8compend 8.2.7.8 compare-at-end (compend) on page 329 9 distpos_gdh 8.2.7.9 distinguishposition for fast table update (distpos_gdh) on page 330 10 rdpscb_gdh 8.2.7.10 read pscb for fast table update (rdpscb_gdh) on page 332 11 wrpscb_gdh 8.2.7.11 write pscb for fast table update (wrpscb_gdh) on page 333 12 setpatbit_gdh 8.2.7.12 setpatbit_gdh on page 334 13 to15 reserved note: commands can be executed by all gxhs with threads.
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 321 of 554 8.2.7.1 fm tree search (ts_fm) figure 8-5. general layout of tse fields in shared memory pool table 8-27. fm tree search input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable that controls the search. lcbanr 1 imm16(0) imm12(0) 0 search results are stored in tsrx/lcba0. 1 search results are stored in tsrx/lcba1. tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address. stores location of key, key- length, and color and determines leaf destination. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four- qw boundary. use only values of x'8', x'a', x'c', and x'e'. key 192 shared memory pool key. pattern to be searched, located in shared memory pool. must be initialized before search (np4gs3a (r1.1) only). the key must be initialized only in the byte boundary of the key length (np4gs3b (r2.0) only). keylength 8 shared memory pool keylength. length of pattern minus 1 in key. must be initialized before search. located in shared memory pool. color 16 shared memory pool color. used only when enabled in ludeftable. must be initialized before search. located in shared memory pool. following is available for gth only. useludefcopyreg 1 imm16(13) imm12(5) enables tse read of ludefcopy register. can save clock cycles, espe- cially when rdludef is executed asynchronously with the picocode that sets the key. 0 tse reads ludeftable. 1 tse does not read the ludeftable and uses information con- tained in ludefcopy register. assumes ludeftable was read previously using rdludef. ludefcopy 96 register input only when useludefcopyreg is ? 1 ? . leaf userdata pattern (103..0) from leaf leaf userdata pattern (191..104) from leaf prefix nla rope reserved dta shape dta (26 bits) hashedkey(191..104) reserved hashedkey(103..0) hkeylen reserved reserved keylen color key(191 ..96) key (95 ...0) 127 119 111 95 31 0 31 87 23 70 tsedpa tsedpa+1 tsedpa+2 tsedpa+3 tsedpa+4 tsedpa+5 63 127 96 121 (5 bits) tsedpa+6 tsedpa+7
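Table 8-27 fixes where each direct operand of ts_fm lives in the 16-bit immediate: ludefindex in bits 12..5, tsedpa in bits 4..1, lcbanr in bit 0 (with useludefcopyreg in bit 13 on the GTH only). The following C sketch makes the layout concrete; the helper function and the idea of composing the immediate in software are assumptions for illustration, not part of the picocode assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the direct (imm16) operand for a ts_fm issued from a GDH thread,
 * following the bit positions in table 8-27. Hypothetical helper. */
static uint16_t ts_fm_imm16(uint8_t ludefindex, uint8_t tsedpa, uint8_t lcbanr)
{
    /* tsedpa must sit on a four-QW boundary: only x'8', x'A', x'C', x'E'. */
    assert(tsedpa == 0x8 || tsedpa == 0xA || tsedpa == 0xC || tsedpa == 0xE);
    return (uint16_t)(((uint16_t)ludefindex << 5) /* bits 12..5: ludeftable entry */
                    | ((tsedpa & 0xFu) << 1)      /* bits  4..1: shared pool addr */
                    | (lcbanr & 0x1u));           /* bit      0: lcba0 or lcba1   */
}
```

The same layout is used by the ts_lpm and ts_smt direct forms in tables 8-29 and 8-31.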
ibm powernp np4gs3 network processor preliminary tree search engine page 322 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.7.2 lpm tree search (ts_lpm) table 8-28. fm tree search results (tsr) output result bit length source description ok/ko 1 flag 0 ko: unsuccessful operation. 1 ok: successful operation. tsrx 512 shared memory pool when ok/ko is ? 1 ? , leaf is read and stored in tsrx. tsrx is mapped into the shared memory pool, at an offset of 4 qw past the starting qw indicated by the input tsedpa parameter. that is, the shared memory pool qw location = tsedpa*4 + 4 lcba0 / 1 26 register when ok/ko is ? 1 ? , leaf address is stored in lcba0 / 1. cacheflags 3 register see section 8.2.3.3 cascade/cache on page 306. nrpscbs 8 register see section 8.2.3.3 cascade/cache on page 306. ludefcopy_gdh 26 register contains ludeftable index in use and a copy of the following ludeftablefields: reserved ludefcopy_gdh(25,24,23) leaf_format_type ludefcopy_gdh(22,21) p1p2_max_size ludefcopy_gdh(20 to 16) dt_size ludefcopy_gdh(15 to 12) nlarope_en ludefcopy_gdh(11) color_en ludefcopy_gdh(10) tree_type ludefcopy_gdh(9,8) ludeftable_index ludefcopy_gdh(7 to 0) will be stored in scalar register and its address in register address map is ? 019 ? . following is available for gth only. ludefcopy 96 register output only when useludefcopyreg is ? 0 ? . set to contents of ludeftable at entry pointed to by ludefindex. table 8-29. lpm tree search input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable used to control the search lcbanr 1 imm16(0) imm12(0) 0 search results are stored in tsrx/lcba0 1 search results are stored in tsrx/lcba1 tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address. stores location of key, key- length, and color and determines leaf destination. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four- qw boundary. use only values of x'8', x'a', x'c', and x'e'. key 192 shared memory pool key. pattern to be searched, located in shared memory pool. must be initialized before search (np4gs3a (r1.1) only). the key must be initialized only in the byte boundary of the key length (np4gs3b (r2.0) only). keylength 8 shared memory pool keylength - length of pattern minus 1 in key. must be initialized before search. located in shared memory pool. color 16 shared memory pool color - used only when enabled in ludeftable. must be initialized before search. located in shared memory pool.
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 323 of 554 following is available for gth only. useludefcopyreg 1 imm16(13) imm12(5) enables tse read of ludefcopy register can save clock cycles, especially when rdludef is executed asyn- chronously with the picocode that sets the key. 0 tse reads ludeftable 1 tse does not read the ludeftable and uses information con- tained in ludefcopy register. assumes ludeftable was read previously using rdludef. special note: useludefcopyreg = ? 1 ? is not supported for ludefentry with cache_enable. ludefcopy 96 register input only when useludefcopyreg is ? 1 ? . set to contents of ludeftable at entry given by ludefindex. table 8-30. lpm tree search results output result bits source description ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation tsrx 512 shared memory pool when ok/ko is ? 1 ? , leaf is read and stored in tsrx tsrx is mapped into the shared memory pool, at an offset of 4 qw past the starting qw indicated by the input tsedpa parameter. that is, shared memory pool qw location = tsedpa*4 + 4 lcba0 / 1 26 register when ok/ko is ? 1 ? , leaf address is stored in lcba0 / 1 cacheflags 3 register see section 8.2.3.3 cascade/cache on page 306 nrpscbs 8 register see section 8.2.3.3 cascade/cache on page 306 ludefcopy_gdh 26 register contains ludeftable index in use and a copy of the following ludeftablefields: reservedludefcopy_gdh(25,24,23) leaf_format_typeludefcopy_gdh(22,21) p1p2_max_sizeludefcopy_gdh(20 to 16) dt_size ludefcopy_gdh(15 to 12) nlarope_enludefcopy_gdh(11) color_enludefcopy_gdh(10) tree_typeludefcopy_gdh(9,8) ludeftable_indexludefcopy_gdh(7 to 0) will be stored in scalar register, and its address in register address map is ? 019 ? . the following is available for gth only. ludefcopy 96 register output only when useludefcopyreg is ? 0 ? . set to contents of ludeftable at entry given by ludefindex. table 8-29. lpm tree search input operands operand bit length operand source description direct indirect
ibm powernp np4gs3 network processor preliminary tree search engine page 324 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.7.3 smt tree search (ts_smt) table 8-31. smt tree search input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable used to control the search lcbanr 1 imm16(0) imm12(0) 0 search results are stored in tsrx/lcba0 1 search results are stored in tsrx/lcba1 tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address - stores location of key, keylength, and color and determines leaf destination. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x'8', x'a', x'c', and x'e'. key 192 shared memory pool key. pattern to be searched, located in shared memory pool. must be initialized before search (np4gs3a (r1.1) only). the key must be initialized only in the byte boundary of the key length (np4gs3b (r2.0) only). keylength 8 shared memory pool keylength. length of pattern minus 1 in key. must be initialized before search. located in shared memory pool. color 16 shared memory pool color. used only when enabled in ludeftable. must be initialized before search. located in shared memory pool. following is available for gth only. useludefcopyreg 1 imm16(13) imm12(5) enables tse read of ludefcopy register can save clock cycles, especially when rdludef is executed asyn- chronously with the picocode that sets the key. 0 tse reads ludeftable 1 tse does not read the ludeftable and uses information con- tained in ludefcopy register. assumes ludeftable was read previ- ously using rdludef. special note: useludefcopyreg = ? 1 ? is not supported for ludefentry with cache enable. ludefcopy 96 register input only when useludefcopyreg is ? 1 ? . set to contents of ludeftable at entry given by ludefindex. table 8-32. smt tree search results output result bit length source description ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation tsrx 512 shared memory pool when ok is ? 1 ? , leaf is read and stored in tsrx. tsrx is mapped into the shared memory pool, at an offset of 4 qw past the starting qw indicated by the input tsedpa parameter. that is, shared memory pool qw location = tsedpa*4 + 4 lcba0 / 1 26 register when ok is ? 1 ? , leaf address is stored in lcba0 / 1 cacheflags 3 register see section 8.2.3.3 cascade/cache on page 306 nrpscbs 8 register see section 8.2.3.3 cascade/cache on page 306
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 325 of 554 8.2.7.4 memory read (mrd) the memory read command provides direct read capability from any location in the control store. lcba0 / 1 provide the full read address. shape is provided by the ludeftable (for leaf or pscb) or directly as part of the command field for the object. the content to be read is stored in tsrx. ludefcopy_gdh 26 register contains ludeftable index in use and a copy of the following ludeftablefields: reserved = ludefcopy_gdh(25,24,23), leaf_format_type = ludefcopy_gdh(22,21), p1p2_max_size = ludefcopy_gdh(20 to 16), dt_size = ludefcopy_gdh(15 to 12), nlarope_en = ludefcopy_gdh(11), color_en = ludefcopy_gdh(10), tree_type = ludefcopy_gdh(9,8), ludeftable_index = ludefcopy_gdh(7 to 0) will be stored in scalar register, and its address in register address map is ? 019 ? . following are available for gth only. ludefcopy 96 register an output only when useludefcopyreg is ? 0 ? . set to contents of ludeftable at entry given by ludefindex. table 8-33. memory read input operands operand bit length operand source description direct indirect shapectrl 2 imm16(14..13) gpr(9..8) 00 direct shape 10 pscb shape is based on address and tree type from ludeftable. that is, lpm tree type and zbt or h1 address shape will be set to (1x2), and for any other case, shape will be set to (1x1). 11 leaf shape from ludeftable ludefindex 8 imm16(12..5) gpr(7..0) ludeftable entry used to read shape information. valid only when shapectrl is ? 10 ? or ? 11 ? . width 2 imm16(9..8) gpr(4..3) width of object to be read. valid only when shapectrl is ? 00 ? . height 3 imm16(7..5) gpr(2..0) height of object to be read. valid only when shapectrl is ? 00 ? . tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address determines the destina- tion location for the data to be read. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. lcbanr 1 imm16(0) imm12(0) 0 address to be read is lcba0 1 address to be read is lcba1 lcba0 / 1 26 register address to be read table 8-32. smt tree search results output result bit length source description
ibm powernp np4gs3 network processor preliminary tree search engine page 326 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.7.5 memory write (mwr) the memory write command provides direct write capability to any location in the control store. lcba0 / 1 provide the full write address. shape is provided by the ludeftable (for leaf or pscb) or directly as part of the command field for the object. the content to be written is stored in tsrx. table 8-34. memory read output results result bit length source description tsrx 512 shared mem- ory pool tsrx is mapped into the shared memory pool, starting at the qw indicated by the input tsedpa parameter for a length of 4 qw. ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-35. memory write input operands operand bit length operand source description direct indirect shapectrl 2 imm16(14..13) gpr(9..8) 00 direct shape 10 pscb shape is based on address and tree type from ludeftable. that is, lpm tree type and zbt or h1 address shape will be set to (1x2), and for any other case, shape will be set to (1x1). 11 leaf shape from ludeftable ludefindex 8 imm16(12..5) gpr(7..0) ludeftable entry used to read shape information. valid only when shapectrl is ? 10 ? or ? 11 ? . width 2 imm16(9..8) gpr(4..3) width of object to be read. valid only when shapectrl is ? 00 ? . height 3 imm16(7..5) gpr(2..0) height of object to be read. valid only when shapectrl is ? 00 ? . tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address determines the source location for the data to be written. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. lcbanr 1 imm16(0) imm12(0) 0 address to be written is lcba0 1 address to be written is lcba1 lcba0 / 1 26 register address to be written tsrx 512 shared memory pool data to be written. tsrx is mapped into the shared memory pool, starting at the qw indicated by the input tsedpa parameter for a length of 4 qw.
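Tables 8-33 and 8-35 give mrd and mwr the same direct-operand layout: shapectrl in imm16(14..13) selects whether the object shape comes from the command itself (width in bits 9..8, height in bits 7..5) or from the ludeftable entry named in bits 12..5, so bits 9..5 are reused and only one interpretation is valid at a time. A hedged C sketch of the two encodings (the helper names are assumptions for illustration):

```c
#include <stdint.h>

/* Direct (imm16) operand for mrd/mwr when shapectrl = 00, i.e. the shape is
 * given directly by the command; see tables 8-33 and 8-35. Hypothetical. */
static uint16_t mrdwr_imm16_direct_shape(uint8_t width, uint8_t height,
                                         uint8_t tsedpa, uint8_t lcbanr)
{
    return (uint16_t)((0x0u << 13)            /* bits 14..13: shapectrl = 00   */
                    | ((width  & 0x3u) << 8)  /* bits  9..8 : object width     */
                    | ((height & 0x7u) << 5)  /* bits  7..5 : object height    */
                    | ((tsedpa & 0xFu) << 1)  /* bits  4..1 : shared pool addr */
                    | (lcbanr & 0x1u));       /* bit      0 : lcba0 or lcba1   */
}

/* Same operand when the shape is taken from the ludeftable (shapectrl = 10
 * for a PSCB, 11 for a leaf); the width/height bits are reused by ludefindex. */
static uint16_t mrdwr_imm16_ludef_shape(uint8_t shapectrl, uint8_t ludefindex,
                                        uint8_t tsedpa, uint8_t lcbanr)
{
    return (uint16_t)(((shapectrl & 0x3u) << 13)   /* bits 14..13: 10 or 11     */
                    | ((uint16_t)ludefindex << 5)  /* bits 12..5 : ludef index  */
                    | ((tsedpa & 0xFu) << 1)
                    | (lcbanr & 0x1u));
}
```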
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 327 of 554 8.2.7.6 hash key (hk) table 8-36. hash key input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable containing hash type tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address - stores location of key, key- length, and color and determines hash key destination. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x'8', x'a', x'c', and x'e'. direct_ hashtype_en 1 imm16(0) imm12(0) enable direct hashtype definition 0 hashtype defined by ludefentry 1 hashtype defined via command direct_ hashtype 4 imm16(8..5) gpr(3..0) use defined hash type for hashing. valid when direct_hashtype_en = 1 key 192 shared memory pool key - pattern to be searched, located in shared memory pool. must be initial- ized before search (np4gs3a (r1.1) only). the key must be initialized only in the byte boundary of the key length np4gs3b (r2.0) only). keylen 8 shared memory pool defines length of pattern minus 1 in key. must be initialized before search. located in the shared memory pool. color 16 shared memory pool must be initialized before search - used only when enabled in ludeftable. located in the shared memory pool. invalid when direct_hashtype_en is set to value ? 1 ? . table 8-37. hash key output results (page 1 of 2) result bit length source description hashedkeyreg 192 shared memory pool hashed key register - contains the hashedkey (including color when enabled in ludeftable) according to section 8.2.1 input key and color register for fm and lpm trees on page 304 and section 8.2.2 input key and color register for smt trees on page 304. hash function is defined in the ludeftable. stored in the shared memory pool qw location = tsedpa*4 + 2 & 3. hashedkeylen 8 shared memory pool hashed key length contains the length of pattern minus 1 in hashedkeyreg. stored in the shared memory pool qw location = tsedpa*4 + 3 bit (7:0). dta 26 shared memory pool dtentry address. stored in shared memory pool qw location = tsedpa*4 + 2, bits (121:96). note: not valid when direct_hashtype_en is set to value ? 1 ? . ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation
ibm powernp np4gs3 network processor preliminary tree search engine page 328 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.7.7 read ludeftable (rdludef) rdludef reads ludeftable at a specified entry and stores the result in the ludefcopy register field in the shared memory pool.the tse can read ludeftable while picocode builds a key because rdludef is executed asynchronously. ludefcopy_gdh 26 register contains ludeftable index in use and a copy of the following ludeftablefields: reserved = ludefcopy_gdh(25,24,23), leaf_format_type = ludefcopy_gdh(22,21), p1p2_max_size = ludefcopy_gdh(20 to 16), dt_size = ludefcopy_gdh(15 to 12), nlarope_en = ludefcopy_gdh(11), color_en = ludefcopy_gdh(10), tree_type = ludefcopy_gdh(9,8), ludeftable_index = ludefcopy_gdh(7 to 0) will be stored in scalar register, and its address in register address map is ? 019 ? . the following is available for gth only. ludefcopy 96 register set to contents of ludeftable at entry given by ludefindex. figure 8-6. general layout of tse rdludef in shared memory pool table 8-38. rdludef input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address determines the destination location for the ludeftable entry data to be read. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. table 8-37. hash key output results (page 2 of 2) result bit length source description ludefindex ludeftable entry (79..0)) ludeftable_rope entry (95 ...0) (gth only) 127 89 88 80 0 tsedpa tsedpa+1 79 reserved 95 reserved
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 329 of 554 8.2.7.8 compare-at-end (compend) compend performs a compare-at-end operation for smt trees. after a tree search command has performed a full search including a compend, picocode can obtain a pointer to another leaf chain from the leaf and start another compend on the leaf chain. the first leaf chain could contain leaves with filtering information, and the second could contain leaves with quality of service information. compend should be used only after a tree search command. the compend command uses key instead of hashedkey during operation, since it is assumed that hashing is not needed for smt. hence, the key field is read by the command, not the hashedkey field. also, since hashing is not done on the key and it is used ? as is, ? it is necessary to initialize the unused key bits to zero. table 8-39. rdludef output results result bit length source description ludefcopy 88 shared memory pool set to contents of ludeftable at entry given by ludefindex and stored in shared memory pool. the entry is placed into two qw starting at the qw indicated by the tsedpa. the entry is right-justified with the most significant bits padded with 0. various fields of ludefentry are at the same bit position as in the ludeftable mem- ory. (79 to 0) bits represent the ludefentry contents and bits (86 to 80) represents the cor- responding ludefindex. they are right-justified. note: tsedpa address and tsedpa + 1 will be overwritten during this command. contents of ludefcopy will be returned at tsedpa and contents of ludefcopy_rope will be returned in the next address. ludefcopy_rope content will be valid for the gth thread. ludefcopy_gdh 26 register contains ludeftable index in use and a copy of the following ludeftablefields: reserved = ludefcopy_gdh(25,24,23), leaf_format_type = ludefcopy_gdh(22,21), p1p2_max_size = ludefcopy_gdh(20 to 16), dt_size = ludefcopy_gdh(15 to 12), nlarope_en = ludefcopy_gdh(11), color_en = ludefcopy_gdh(10), tree_type = ludefcopy_gdh(9,8), ludeftable_index = ludefcopy_gdh(7 to 0) will be stored in scalar register and its address in register address map is ? 019 ? . ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-40. compend input operands (page 1 of 2) operand bit length operand source description direct indirect ludefindex 8 imm16(12:5) gpr(7:0) defines the entry in the ludeftable. lcbnanr 1 imm16(0) imm12(0) 0 search results are stored in tsrx/lcba0 1 search results are stored in tsrx/lcba1 tsedpa 4 imm16(4..1) imm12(4..1) tse thread shared memory pool address - stores location of key, keylength, and color and determines leaf destination. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x ? 8 ? ,x ? a ? ,x ? c ? , and x ? e ? .
ibm powernp np4gs3 network processor preliminary tree search engine page 330 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.7.9 distinguishposition for fast table update (distpos_gdh) distpos_gdh performs a pattern compare between the pattern stored in hashedkey of the shared memory pool and a pattern from the leaf specified in tsrx. the result is stored in the distposreg_gdh register. the ok flag is set when a full match has been detected. lcba0 / 1 31 register start address of leaf chain and its shape. bits: 30:29 leaf width 28:26 leaf height 25:0 leaf address note: a valid leaf width and leaf height must be provided with the leaf address. key 192 shared memory pool input of previous tree search command located in the shared memory pool at an offset of 0 and 1qw from the qw indicated by the tsedpa. keylen 8 shared memory pool input of previous tree search command located in the shared memory pool at the qw indicated by the tsedpa table 8-41. compend output results result bit length source description ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation tsrx 512 shared mem- ory pool when ok is ? 1 ? , leaf is read and stored in tsrx when ok is ? 0 ? , last leaf of chain is stored in tsrx tsrx is mapped into the shared memory pool, starting at an offset of 4 qw past the qw indicated by the input tsedpa parameter for a length of 4 qw. lcba0 / 1 26 register when ok is ? 1 ? , leaf address is stored in lcba0 / 1 when ok is ? 0 ? , leaf address of last leaf in chain is stored in lcba0 / 1 note: shape for the corresponding leaf will not be valid when lcaba0/1 is read. table 8-40. compend input operands (page 2 of 2) operand bit length operand source description direct indirect
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 331 of 554 figure 8-7. shared memory pool with distpos_gdh command subfields table 8-42. distpos_gdh input operands operand bit length operand source description direct indirect tsedpa_leaf 4 imm16(4..1) imm12(4..1) indicates the leaf storage location in the thread shared memory pool. the tsedpa_leaf is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x ? 8 ? ,x ? a ? , x ? c ? , and x ? e ? . tsedpa_hk 4 imm16(8..5) imm12(8..5) indicates the hashedkey storage location in the thread shared memory pool. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.10 shared memory pool) and is constrained to be on a four-qw boundary. use only values of x ? 8 ? ,x ? a ? ,x ? c ? , and x ? e ? . hashedkey is assumed to be stored at tsedpa +2 location in shared memory pool. hashedkey 192 shared memory pool contains the hashed pattern and it will be stored at tsedpa_hk +2 (7..0) quad word location. the structure of the hashed key in shared memory pool is shown in figure 8-7 . hashedkeylen 8 shared memory pool contains length of hashed pattern minus 1. stored at tsedpa_hk +2 quad word location. structure of hashed key in shared memory pool is shown in figure 8- 7 . tsrx 512 shared memory pool contains second pattern with pattern length (for lpm & fm only). tsrx is mapped into the shared memory pool, starting at an offset of 4 qw past the qw indicated by the input tsedpa_leaf parameter for a length of 4 qw. leaf userdata pattern (103..0) from leaf leaf userdata pattern (191..104) from leaf prefix nla rope reserved dta shape dta (26 bits) hashedkey (191..104) reserved hashedkey (103..0) hkeylen reserved reserved keylen color key (191 ..96) key (95 ...0) 127 119 111 95 31 0 31 87 23 70 tsedpa_hk tsedpa_hk+1 tsedpa_hk+2 tsedpa_hk+3 tsedpa_leaf tsedpa_leaf+1 63 127 96 121 (5 bits)
ibm powernp np4gs3 network processor preliminary tree search engine page 332 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 distpos_gdh applies to lpm and fm only. it needs p1p2_max_size value as a part of the computation. the value is assumed to be valid prior to this command. this value is stored as a scalar register, and is inde- pendent for each thread. 8.2.7.10 read pscb for fast table update (rdpscb_gdh) rdpscb_gdh reads a pscb from the control store and stores it in one of the pscbx locations in the shared memory pool. for lpm, the entire pscb is read from memory at the address given by pscb.addr and stored in pscb. for fm, the entire pscb is read from memory and converted into a lpm pscb format and either npa0/1 or lcba0/1 will be set to all zero, since only one of the fields will be valid. this command is not supported for smt. structure for pscb register shared memory pool is as follows: table 8-43. distpos_gdh output results result bit length source description ok/ko 1 flag 0 pattern does not match 1 pattern in hashedkey matches pattern in tsrx distposreg_gdh 8 scalar register smallest bit position where pattern in hashedkey differs from pattern in tsrx. will be stored in scalar register and its address in the register address map is ? 01a ? . ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation figure 8-8. shared memory pool with pscb subfields lcba1 address reserved index npa0 reserved 127 87 95 31 0 pscbx dpa 63 dpa+1 reserved 79 patbit 72 71 lcba0 npa1 np1 np0
8.2.7.11 write pscb for fast table update (wrpscb_gdh)

wrpscb_gdh writes a pscb stored in pscbx (in the shared memory pool) to the control store. for lpm, the entire pscb is written to memory at the address given by pscb.addr. for fm, the pscb is written to memory in fm pscb format. in both cases, wrpscb_gdh generates the pscb's two format bits. when the pscb represents a dtentry, only half of the pscb is written to memory; that is, the entire dtentry is written by picocode. this command is not supported for smt.

the pscb_cntl/format bits for the dt/pscb entry, which indicate an end node or the valid presence of a leaf, are computed from the content of pscb.lcba0/1. no checking or validation is performed on pscb.npa0/1 or pscb.nbt0/1: pscb.nbt0/1 is written to memory as is, and for lpm pscb.npa0/1 is written as is.

table 8-44. rdpscb_gdh input operands (operand / bit length / direct source / indirect source / description)
dpa_pscb / 6 / imm16(5..0) / imm12(5..0) / the dpa_pscb is a full address into the shared memory pool (see 7.2.4 shared memory pool on page 193) and indicates the working address of the pscb register. a full pscb occupies 2 quadwords of the shared memory pool, and this address must start at an even address.
rdwrpscb_cntl / 2 / imm16(7..6) / imm12(7..6) / 00 result is based on the patbit value in the shared memory pool, as shown in the diagram above: patbit = 0, branch 0 entry of the pscb is read; patbit = 1, branch 1 entry of the pscb is read. 01 branch 0 entry of the pscb is read. 10 branch 1 entry of the pscb is read. 11 branch 0 and branch 1 entries of the pscb are read.
rdwrpsc_dt_flag / 1 / imm16(8) / imm12(8) / indicates the entry is a dt. note: not valid for np4gs3b (r2.0).
pscb.addr / 26 / shared memory pool (121..96) / control store address of the pscb. this is stored in the shared memory pool.

table 8-45. rdpscb_gdh output results (result / bit length / source / description)
pscb / - / register / the pscb is read from the control store and stored in the pscb register. the npa0, nbt0, lcba0 and npa1, nbt1, lcba1 fields are changed; the remaining fields are not changed. for fm: either npa0 or lcba0 (and either npa1 or lcba1) is set to zero, since only one of the two fields is valid; nbt0 is also set to zero when npa0 is zero, and nbt1 is set to zero when npa1 is zero (that is, for an end node, nbt and npa are not valid). for smt: the content of npa0/1, nbt0/1, and lcba0/1 is based on the actual content read from memory; the logic does not zero out any part of the data.
ok/ko / 1 / flag / 0 ko: unsuccessful operation. 1 ok: successful operation.
ibm powernp np4gs3 network processor preliminary tree search engine page 334 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.7.12 setpatbit_gdh setpatbit_gdh instruction reads a bit from the hashedkey pattern or leaf reference pattern stored in tsrx of the shared memory pool and stores this bit in the setpatbit_gdh register. picocode must copy it to the appropriate patbit field of tsedpa_pscbx of the shared memory pool. table 8-46. wrpscb_gdh input operands operand bit length operand source description direct indirect dpa_pscb 6 imm16(5..0) imm12(5..0) the dpa_pscb is a full address of the 7.2.4 shared memory pool on page 193. this indicates that the work- ing address of the pscb register full pscb will occupy 2 quadwords from the shared memory pool. this address must start at an even address. rdwrpscb_cntl 2 imm16(7..6) imm12(7..6) 00 action is based on patbit value. patbit = 0 branch 0 entry of the pscb is read patbit = 1 branch 1 entry of the pscb is read 01 branch 0 entry of the pscb is read 10 branch 1 entry of the pscb is read 11 branch 0 and branch 1 entry of the pscb is read rdwrpsc_dt_flag 1 imm16(8) imm12(8) indicates entry is dt. note: not valid for np4gs3b (r2.0). pscb.addr 26 shared memory pool (121..96) address to be written in control store of the pscb. this is stored in shared memory pool. table 8-47. wrpscb_gdh output results result bit length source description ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 335 of 554 setpatbit_gdh applies to lpm only, and requires the p1p2_max_size value as a part of its computation. this value is assumed to be valid prior to this command. the value is stored as a scalar register, and is inde- pendent for each thread. table 8-48. setpatbit_gdh input operands operand bit length operand source description direct indirect dpa_pscb 6 imm16(5..0) imm12(5..0) the dpa_pscb is a full address of the thread's shared address (see 7.2.4 shared memory pool on page 193). this indicates the working address of the pscb register. the dpa_pscbx.index field will be used. tsedpa_hk_leaf 4 imm16(9..6) imm12(9..6) indicates the leaf or hashedkey storage location in the thread shared memory pool. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only val- ues of x ? 8 ? ,x ? a ? ,x ? c ? ,and x ? e ? . when hk_leaf_flag = 0, hashedkey will be used for this operation. when hk_leaf_flag = 1, leaf content will be used for operation. when leaf is to be used, location for leaf will be tsedpa_hk_leaf; and when hashedkey is used, loca- tion for hashedkey will be tsedpa_hk_leaf +2. hk_leaf_flag 1 imm16(10) imm12(10) when hk_leaf_flag = 0, hashedkey will be used for this operation and tsedpa_hk_leaf +2 will be used as a hashedkey location in shared memory pool. when hk_leaf_flag = 1, leaf content will be used for operation and tsedpa_hk_leaf will be used as a leaf location in shared memory pool. hashedkey 192 shared memory pool contains the hashed pattern, and it will be stored at tsedpa_hk_leaf +2 & +3 qw location. structure of hashed key in shared memory pool is shown in the above figure. hashedkeylen 8 shared memory pool contains length of hashed pattern minus 1, and it will be stored at tsedpa_hk_leaf +3 (7..0) qw location. struc- ture of hashed key in shared memory pool is shown in the above figure. tsrx 512 shared memory pool contains second pattern with pattern length (for lpm only). tsrx is mapped into the shared memory pool, starting at an offset of 4 qw past the qw indicated by the input tsedpa_hk_leaf parameter, for a length of 4 qw. table 8-49. setpatbit_gdh output results result bit length source description setpatbit_gdh 1 register smallest bit position where pattern in hashedkey differs from pattern in leaf at tsedpa_hk_leaf location. its address in register address map is ? 01b ? . ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation
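As described above, setpatbit_gdh reads one bit out of a stored pattern (the hashed key or a leaf reference pattern) so that picocode can copy it into the patbit field of a PSCB image, with the bit position governed by the pscbx.index field. A minimal illustrative helper, assuming the pattern is packed MSB-first into bytes; the packing convention and the function name are assumptions, not the hardware's internal representation.

```c
#include <stdint.h>

/* Return bit 'pos' of a pattern stored MSB-first in 'pat' (bit 0 is the
 * most significant bit of pat[0]). Illustrative model of the bit extraction
 * performed by setpatbit_gdh; the MSB-first convention is an assumption. */
static int pattern_bit(const uint8_t *pat, unsigned pos)
{
    return (pat[pos >> 3] >> (7 - (pos & 7))) & 1;
}
```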
ibm powernp np4gs3 network processor preliminary tree search engine page 336 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.8 gth hardware assist instructions note: gth hardware assist instructions will not be supported in the future. 8.2.8.1 hash key gth (hk_gth) table 8-50. general gth instructions opcode command detail section 16 hk_gth 8.2.8.1 hash key gth (hk_gth) on page 336 17 rdludef_gth 8.2.8.2 read ludeftable gth (rdludef gth) on page 337 18 tsenqfl 8.2.8.3 tree search enqueue free list (tsenqfl) on page 338 19 tsdqfl 8.2.8.4 tree search dequeue free list (tsdqfl) on page 338 20 rclr 8.2.8.5 read current leaf from rope (rclr) on page 339 21 ardl 8.2.8.6 advance rope with optional delete leaf (ardl) on page 340 22 tlir 8.2.8.7 tree leaf insert rope (tlir) on page 340 23 reserved 24 clrpscb 8.2.8.8 clear pscb (clrpscb) on page 341 25 rdpscb 8.2.8.9 read pscb (rdpscb) on page 341 26 wrpscb 8.2.8.10 write pscb (wrpscb) on page 342 27 pushpscb 8.2.8.11 push pscb (pushpscb) on page 343 28 distpos 8.2.8.12 distinguish (distpos) on page 343 29 tsr0pat 8.2.8.13 tsr0 pattern (tsr0pat) on page 344 30 pat2dta 8.2.8.14 pattern 2dta (pat2dta) on page 344 31 reserved note: the instructions listed in table 8-26: general tse instructions on page 320 can only be executed by the gth table 8-51. hash key gth input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable containing hash type tsedpa 4 imm16(4..1) imm12(4..1) the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. direct_ hashtype_en 1 imm16(0) imm12(0) enable direct hashtype definition 0 hashtype defined by ludefentry 1 hashtype defined via command direct_ hashtype 4 imm16(8..5) gpr(3..0) use defined hash type for hashing. valid when direct_hashtype_en = 1
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 337 of 554 8.2.8.2 read ludeftable gth (rdludef gth) rdludef reads the ludeftable at a specified entry and stores the result in the ludefcopy register. the tse can read ludeftable while picocode builds a key because rdludef is executed asynchronously. once the key is ready, the tree search execution can be executed with the useludefcopyreg flag set to ? 1 ? . key 192 shared memory pool provides pattern to be searched. must be initialized before search. located in the shared memory pool. keylen 8 shared memory pool defines length of pattern minus 1 in key. must be initialized before search. located in the shared memory pool. color 16 shared memory pool must be initialized before search - used only when enabled in ludeftable. located in the shared memory pool. invalid when direct_hashtype_en is set to value ? 1 ? . table 8-52. hash key gth output results result bit length source description hashedkeyreg 192 register contains the hashedkey, including color when color is enabled in ludeftable, according to section 8.2.1 input key and color register for fm and lpm trees on page 304 and section 8.2.2 input key and color register for smt trees on page 304. hash function is defined in ludeftable. hashed key is not stored in shared memory pool. hashedkeylen 8 register contains length of pattern minus 1 in hashedkeyreg. hashed key is not stored in shared memory pool. dta 26 register dtentry address (hashed key is not stored in shared memory pool.) ludefcopy 96 register set to contents of ludeftable at entry given by ludefindex. note: valid for gth only. ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-53. rdludef_gth input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable table 8-54. rdludef_gth output results result bit length source description ludefcopy 96 register scalar register contains content of ludeftable at the entry. no content will be written back to shared memory pool. ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-51. hash key gth input operands operand bit length operand source description direct indirect
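The rdludef_gth description above implies a simple overlap pattern: start the asynchronous ludeftable read, build the key while it completes, then search with useludefcopyreg set to '1'. The following C sketch paraphrases that flow; every function name is a hypothetical stand-in for the corresponding picocode command, not a real API.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the TSE commands and picocode work involved. */
void rdludef_gth(uint8_t ludefindex);                 /* asynchronous table read */
void build_key_in_shared_pool(uint8_t tsedpa);        /* key construction        */
int  ts_lpm(uint8_t ludefindex, int useludefcopyreg,
            uint8_t tsedpa, int lcbanr);

/* Overlap the ludeftable read with key construction, then search using the
 * cached copy (useludefcopyreg = 1), as described for rdludef above. */
int search_with_cached_ludef(uint8_t ludefindex, uint8_t tsedpa)
{
    rdludef_gth(ludefindex);          /* kick off the asynchronous table read  */
    build_key_in_shared_pool(tsedpa); /* build the key while the read completes */
    return ts_lpm(ludefindex, 1 /* useludefcopyreg */, tsedpa, 0 /* lcba0 */);
}
```

Note that useludefcopyreg = '1' is not supported for a ludefentry with cache enable, as stated in tables 8-29 and 8-31.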
ibm powernp np4gs3 network processor preliminary tree search engine page 338 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.8.3 tree search enqueue free list (tsenqfl) tsenqfl releases a control block such as a leaf or pscb to a free list. the address of the memory location to be freed is stored in lcba0 / 1. the leaf or pscb index to the free list is provided by the ludeftable or directly by the command line. the enqueue operation always adds an address to the bottom of a free list fifo-style. entries cannot be added or removed from the middle of a free list. when freelistctrl = 11, tsenqfl increments the leafcount field in ludeftable. 8.2.8.4 tree search dequeue free list (tsdqfl) tsdqfl dequeues an address from a given fqlinklist. the address that has been dequeued from the free list is stored in lcba0 / 1. the leaf or pscb index to the free list is provided by the ludeftable or directly by the command line. when freelistctrl is ? 11 ? , tsdqfl decrements the leafcount field in ludeftable. table 8-55. tsenqfl input operands operand bit length operand source description direct indirect freelistctrl 2 imm16(14..13) gpr(9..8) 00 direct free list index 10 pscb free list index from ludeftable 11 leaf free list index from ludeftable ludefindex/ freelistindex 8 imm16(12..5) gpr(7..0) defines the entry in the ludeftable used to read free list index infor- mation or directly defines freelistindex. srctype 3 imm16(2..0) imm12(2..0) 000 lcba0 001 lcba1 100 pscb0.addr (for gch only) 101 pscb1.addr (for gch only) 110 pscb2.addr (for gch only) lcba0 / 1 pscb0 / 1 / 2.addr 26 register the address to be freed or enqueued table 8-56. tsenqfl output results result bit length source description ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-57. tsdqfl input operands operand bit length operand source description direct indirect freelistctrl 2 imm16(14..13) gpr(9..8) 00 direct free list index 10 pscb free list index from ludeftable 11 leaf free list index from ludeftable ludefindex/ freelistindex 8 imm16(12..5) gpr(7..0) defines the entry in ludeftable used to read free list index informa- tion or directly defines freelistindex. tgttype 3 imm16(2..0) imm12(2..0) 000 lcba0 001 lcba1 100 pscb0.addr (for gch only) 101 pscb1.addr (for gch only) 110 pscb2.addr (for gch only)
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 339 of 554 8.2.8.5 read current leaf from rope (rclr) rclr is used to ? walk the rope ? for such processes as aging. rclr reads the leaf at the current leaf address defined in ludeftable. it stores the leaf address in lcba0 and the leaf contents in tsr0. when rope walking begins, the tse invokes rclr and picocode saves the leaf address (lcba0) in a gpr. before reading the next leaf, the tse invokes the advance rope with optional delete leaf command (ardl), after which rclr can be invoked again. to determine whether a rope walk has ended, picocode compares lcba0 with the gpr to verify whether the leaf that was read is the same as the first leaf. if the leaf is the same, the rope walk has ended. at any time during the rope walk, picocode can delete a leaf from the rope using ardl with the deleteleaf flag set to 1. this is useful when the leaf is to be aged out. rclr can automatically delete leaves from the rope when the deletepending bit in the leaf is set. when this feature is enabled, the tse deletes a leaf and reads the next leaf, which is also deleted when the deletepending bit is set. the process is repeated to delete multiple leaves. after rclr has executed with ok = 1, the contents of tsr0 and lcba0 correspond with cla in the ludeftable, and the previous leaf on the rope has an address of pla. ok = 0, or ko, means the rope is empty. table 8-58. tsdqfl output results result bit length source description lcbao/1 pscb0 / 1 / 2.addr 26 register dequeued address ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-59. rclr input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable used to read rope reserved 3 imm16(3..1) imm12(3..1) must be set to ? 000 ? delenable 1 imm16(0) imm12(0) when ? 1 ? , leaves with deletepending bit are automatically deleted from rope and leaf address will be enqueued in leaf free queue tsedpa 4 imm16(4..1) imm12(4..1) location for the leaf to be stored. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared mem- ory pool on page 193) and is constrained to be on a four-qw bound- ary. table 8-60. rclr output results result bit length source description ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation lcba0 26 register address of leaf that has been read tsrx 512 shared memory pool leaf content is stored in tsrx. tsrx is mapped into the shared memory pool, starting at an offset of 4 qw past the qw indicated by the input tsedpa parameter for a length of 4 qw.
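The rclr/ardl description above amounts to a simple walk loop: read the current leaf, remember the first leaf address, decide whether to age it out, advance the rope (optionally deleting), and stop when the walk returns to the first leaf. The C sketch below paraphrases that flow; rclr(), ardl(), should_age(), and leaf_t are hypothetical stand-ins for the picocode commands and data involved, not real interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the hardware definition. */
typedef struct { uint8_t qw[64]; } leaf_t;             /* 512-bit leaf image      */
bool rclr(unsigned ludefindex, leaf_t *leaf, uint32_t *lcba0);
bool should_age(const leaf_t *leaf);                   /* picocode aging policy   */
void ardl(unsigned ludefindex, bool delete_leaf);      /* advance, optional delete */

/* Aging pass over a rope, paraphrasing the rclr/ardl usage described above. */
void rope_aging_walk(unsigned ludefindex)
{
    leaf_t   leaf;
    uint32_t lcba0, first;

    if (!rclr(ludefindex, &leaf, &lcba0))      /* KO means the rope is empty    */
        return;
    first = lcba0;                             /* remember the first leaf address */

    do {
        ardl(ludefindex, should_age(&leaf));   /* advance; optionally delete     */
        if (!rclr(ludefindex, &leaf, &lcba0))  /* read the new current leaf      */
            break;
    } while (lcba0 != first);                  /* same leaf again: walk has ended */
}
```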
ibm powernp np4gs3 network processor preliminary tree search engine page 340 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.8.6 advance rope with optional delete leaf (ardl) ardl advances the rope, or updates cla and pla in ludeftable. the nla field from the leaf already stored in tsr0 is read and stored in cla. pla is then updated to the previous value of cla unless the leaf is deleted. in this case, pla remains the same and the nlarope field for the leaf with the current pla address is set to the new cla. when leaf deletion is enabled, the current leaf is deleted prior to advancing the rope. the contents of tsr0 and lcba0 can be destroyed because ardl uses tsr0 and lcba0 as work areas to update the nlarope field. after ardl is executed, cla and pla (in the ludeftable) are updated and rclr (described in the next section) can be executed again. 8.2.8.7 tree leaf insert rope (tlir) tlir inserts a leaf into the rope. the leaf must already be stored in tsr0 (done automatically by picocode during tlir) and the leaf address must already be available in lcba0. tlir maintains the rope, which involves updating the pva field in ludeftable, the nlarope leaf field in control store, and the nlarope leaf field stored in tsrx. the leaf is inserted into the rope ahead of the current leaf, which has address cla. field pla is updated to the new leaf address, which is lcba0 / 1, in ludeftable. following tlir execution, pico- code must invoke mwr to write the leaf into control store. the contents of tsr1 and lcba1 can be destroyed because tlir uses tsr1 and lcba1 as a work area. table 8-61. ardl input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable deleteleaf 1 imm16(0) imm12(0) enable deletion of current leaf from rope. 0 do not delete leaf. 1 delete leaf tsedpa 4 imm16(4..1) imm12(4..1) location of the current leaf address. the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. tsrx 26 register contents of current leaf (address cla in ludeftable). tsrx is mapped into the shared memory pool, starting at an offset of 4 qw past the qw indicated by the input tsedpa parameter for a length of 4 qw. table 8-62. ardl output results result bit length source description tsrx 512 shared memory pool contents of tsrx will not be destroyed lcba0 26 register contents of lcba0 have been destroyed ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 341 of 554 8.2.8.8 clear pscb (clrpscb) this command writes all zeros to pscb0 / 1 / 2. 8.2.8.9 read pscb (rdpscb) rdpscb reads a pscb from the control store and stores it in one of the pscb0 / 1 / 2 registers. for lpm, the entire pscb is read from memory at the address given by pscb.addr and stored in pscb. for fm, the entire pscb is read from memory and converted into a lpm pscb format. table 8-63. tlir input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable lcba0 26 register address of leaf to be inserted into rope tsedpa 4 imm16(4..1) imm12(4..1) the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. only values of x'8', x'a', x'c', and x'e' should be used. tsrx 26 shared memory pool contents of leaf to be inserted into rope. tsrx is mapped into the shared memory pool, starting at an offset of 4 qw past the qw indicated by the input tsedpa parameter for a length of 4 qw. table 8-64. tlir output results result bit length source description tsrx 512 shared memory pool leaf nla field has been updated ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-65. clrpscb input operands operand bit length operand source description direct indirect pn 2 imm16(1..0) -- selects pscb0, pscb1, or pscb2 register table 8-66. clrpscb output results result bit length source description pscb -- register set to all zeros ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation
ibm powernp np4gs3 network processor preliminary tree search engine page 342 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.8.10 write pscb (wrpscb) wrpscb writes a pscb stored in pscb0, 1, or 2 to the control store. for lpm, the entire pscb is written to memory to the address given by pscb.addr. for fm, the pscb is written to memory in fm pscb format. an error flag is set when the pscb is not in fm pscb format. in both cases, wrpscb generates the pscb ? s two format bits. when the pscb represents a dtentry, only half of pscb is written to memory. that is, the entire dtentry is written. table 8-67. rdpscb input operands operand bit length operand source description direct indirect pn 2 imm16(1..0) imm12(1..0) selects pscb0, pscb1, or pscb2 register pscb.addr 26 register address in control store of pscb to be read rdwrpscb_cntl 2 imm16(3..2) imm12(3..2) 00 result is based on patbit value. patbit = 0branch 0 entry of the pscb is read patbit = 1branch 1 entry of the pscb is read 01 branch 0 entry of the pscb is read 10 branch 1 entry of the pscb is read 11 branch 0 and branch 1 entry of the pscb is read rdwrpsc_ dt_flag 1 imm16(4) imm12(4) indicates entry is dt table 8-68. rdpscb output results result bit length source description pscb -- register pscb is read from control store and stored in register pscb. npa0, nbt0, lcba0 and npa1, nbt1, lcba1 fields are changed. remainders are not changed. ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-69. wrpscb input operands operand bit length operand source description direct indirect pn 2 imm16(1..0) imm12(1..0) selects pscb0, pscb1, or pscb2 register pscb.addr 26 register address in control store of the pscb to be written rdwrpscb_cntl 2 imm16(3..2) imm12(3..2) 00 action is based on patbit value. patbit = 0branch 0 entry of the pscb is written patbit = 1branch 1 entry of the pscb is written 01 branch 0 entry of the pscb is written 10 branch 1 entry of the pscb is written 11 branch 0 and branch 1 entry of the pscb is written rdwrpsc_dt_flag 1 imm16(4) imm12(4) indicates entry is dt
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 343 of 554 8.2.8.11 push pscb (pushpscb) pushpscb pushes the pscb stack. 8.2.8.12 distinguish (distpos) distpos performs a pattern compare between the patterns stored in hashedkey and tsr0. the result is stored in the distposreg register. the ok flag is set when a full match has been detected. table 8-70. pushpscb input operands operand bit length operand source description direct indirect pscb0 -- -- contains a pscb pscb1 -- -- contains a pscb table 8-71. pushpscb output results result bit length source description pscb2 -- register set to pscb1 pscb1 -- register set to pscb0 pscb0 -- register set to all zeros ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-72. distpos input operands operand bit length operand source description direct indirect hashedkey 192 register contains hashed pattern hashedkeylen 8 register contains length of hashed pattern minus 1 tsedpa 4 imm16(4..1) imm12(4..1) the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x'8', x'a', x'c', and x'e'. tsrx 512 shared memory pool contains second pattern with pattern length (for lpm only). tsrx is mapped into the shared memory pool, starting at an offset of 4 qw past the qw indicated by the input tsedpa parameter for a length of 4 qw. table 8-73. distpos output results result bit length source description ok/ko 1 flag 0 pattern does not match 1 pattern in hashedkey matches pattern in tsrx distpos 8 register smallest bit position where pattern in hashedkey differs from pattern in tsrx ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation
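distpos (like its GDH counterpart distpos_gdh) reports the smallest bit position at which the pattern in hashedkey differs from the pattern in tsrx, with the OK flag indicating a full match. A software equivalent is sketched below; the byte-array, MSB-first representation of the patterns is an assumption made only so the model is self-contained.

```c
#include <stdint.h>

/* Return the smallest bit position (0 = most significant bit of byte 0) at
 * which patterns a and b differ over the first 'len' bits, or -1 when they
 * match completely (the hardware reports that case via the OK flag).
 * Software model only; the MSB-first packing is an assumption. */
static int distpos_model(const uint8_t *a, const uint8_t *b, unsigned len)
{
    for (unsigned pos = 0; pos < len; pos++) {
        unsigned byte = pos >> 3, shift = 7 - (pos & 7);
        if ((((a[byte] ^ b[byte]) >> shift) & 1) != 0)
            return (int)pos;
    }
    return -1;   /* full match */
}
```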
ibm powernp np4gs3 network processor preliminary tree search engine page 344 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 8.2.8.13 tsr0 pattern (tsr0pat) tsr0pat reads a bit from the pattern stored in tsrx and stores this bit in the patbit_tsr0 register. 8.2.8.14 pattern 2dta (pat2dta) pat2dta reads a pattern from tsrx, stores it in the hashedkey, and sets the dta register accordingly. pat2dta does not perform a hash since the pattern in a leaf is already hashed. pattern read from tsrx is assumed to be already hashed. table 8-74. tsr0pat input operands operand bit length operand source description direct indirect bitnum 8 -- gpr(7..0) selects bit in tsr0 pattern tsedpa 4 imm16(4..1) imm12(4..1) the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x'8', x'a', x'c', and x'e'. table 8-75. tsr0pat output results result bit length source description patbit_tsr0 1 register set to value of bit bitnum of pattern stored in tsr0 ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation table 8-76. pat2dta input operands operand bit length operand source description direct indirect ludefindex 8 imm16(12..5) gpr(7..0) defines entry in ludeftable used to calculate dta from hashedkey tsedpa 4 imm16(4..1) imm12(4..1) the tsedpa is the high order 4 bits of the thread's shared memory pool address (see 7.2.4 shared memory pool on page 193) and is constrained to be on a four-qw boundary. use only values of x'8', x'a', x'c', and x'e'. table 8-77. pat2dta output results result bit length source description dta 26 register dtentry address corresponding to dt definition in ludeftable and hashedkey ok/ko 1 flag 0 ko: unsuccessful operation 1 ok: successful operation
8.2.9 hash functions

in the following figures, the input is always the 192-bit key and the output is a 192-bit hashed output before color insertion. the hash_type field of the ludeftable defines the hasher to be used. if color is enabled, the color is inserted at the bit position given by dt_size in the ludeftable and the 16 lsbs of the key are ignored, since a maximum key length of 176 bits is supported when color is enabled.

table 8-78. general hash functions (hash_type / name / description)
0      no hash                  no hash is performed and the hashed output h(191..0) equals the input key k(191..0). can be used for smt trees, or for lpm trees if the 32-bit ip lpm hash cannot be used.
1      192-bit ip hash          uses four copies of the ip hash box. see figure 8-10: 192-bit ip hash function on page 346.
2      192-bit mac hash         see figure 8-11: mac hash function on page 347.
3      192-bit network distpos  see figure 8-12: network dispatcher hash function on page 348.
4      reserved
5      48-bit mac swap          see figure 8-13: 48-bit mac hash function on page 349.
6      60-bit mac swap          see figure 8-14: 60-bit mac hash function on page 350.
7      reserved
8      8-bit hasher             see figure 8-15: 8-bit hash function on page 351.
9      12-bit hasher            see figure 8-16: 12-bit hash function on page 352.
10     16-bit hasher            see figure 8-17: 16 bit hash function on page 353.
11-15  reserved

figure 8-9. no-hash function (diagram: the key words k0-k5, bits 191..0, pass straight through to the hashed output words h0-h5)
ibm powernp np4gs3 network processor preliminary tree search engine page 346 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 figure 8-10. 192-bit ip hash function b0 a0 k0 (bits 191..160) d0 c0 b4 a4 k4 (bits 63..32) d4 c4 b5 a5 k5 (bits 31..0) d5 c5 b1 a1 k1 (bits 159..128) d1 c1 b2 a2 k2 (bits 127..96) d2 c2 b3 a3 k3 (bits 95..64) d3 c3 ip hash box ip hash box ip hash box ip hash box ip hash box ip hash box a0 h0 (bits 191..160) d0 c0 b4 a4 h4 (bits 63..32) d4 c4 b5 a5 h5 (bits 31..0) d5 c5 b1 a1 h1 (bits 159..128) d1 c1 b2 a2 h2 (bits 127..96) d2 c2 b3 a3 h3 (bits 95..64) d3 c3 b0
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 347 of 554 figure 8-11. mac hash function b0 a0 k0 (bits 191..160) d0 c0 b4 a4 k4 (bits 63..32) d4 c4 b5 a5 k5 (bits 31..0) d5 c5 b1 a1 k1 (bits 159..128) d1 c1 b2 a2 k2 (bits 127..96) d2 c2 b3 a3 k3 (bits 95..64) d3 c3 ip hash box ip hash box ip hash box ip hash box ip hash box ip hash box a0 h0 (bits 191..160) d0 c0 b4 a4 h4 (bits 63..32) d4 c4 b5 a5 h5 (bits 31..0) d5 c5 b1 a1 h1 (bits 159..128) d1 c1 b2 a2 h2 (bits 127..96) d2 c2 b3 a3 h3 (bits 95..64) d3 c3 b0
ibm powernp np4gs3 network processor preliminary tree search engine page 348 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 figure 8-12. network dispatcher hash function b0 a0 k0 (bits 191..160) d0 c0 b4 a4 k4 (bits 63..32) d4 c4 b5 a5 k5 (bits 31..0) d5 c5 b1 a1 k1 (bits 159..128) d1 c1 b2 a2 k2 (bits 127..96) d2 c2 b3 a3 k3 (bits 95..64) d3 c3 ip hash box ip hash box ip hash box ip hash box ip hash box ip hash box a0 h0 (bits 191..160) d0 c0 b4 a4 h4 (bits 63..32) d4 c4 b5 a5 h5 (bits 31..0) d5 c5 b1 a1 h1 (bits 159..128) d1 c1 b2 a2 h2 (bits 127..96) d2 c2 b3 a3 h3 (bits 95..64) d3 c3 b0 32-bit bitwise swap o(i)  i(31-i)
ibm powernp np4gs3 preliminary network processor np3_dl_sec08_tree.fm.08 may 18, 2001 tree search engine page 349 of 554 figure 8-13. 48-bit mac hash function b0 a0 k0 (bits 191..160) d0 c0 b4 a4 k4 (bits 63..32) d4 c4 b5 a5 k5 (bits 31..0) d5 c5 b1 a1 k1 (bits 159..128) d1 c1 b2 a2 k2 (bits 127..96) d2 c2 b3 a3 k3 (bits 95..64) d3 c3 ip hash box ip hash box ip hash box ip hash box ip hash box ip hash box a0 h0 (bits 191..160) d0 c0 b4 a4 h4 (bits 63..32) d4 c4 b5 a5 h5 (bits 31..0) d5 c5 b1 a1 h1 (bits 159..128) d1 c1 b2 a2 h2 (bits 127..96) d2 c2 b3 a3 h3 (bits 95..64) d3 c3 b0 48-bit bitwise swap o(i)  i(47-i)
ibm powernp np4gs3 network processor preliminary tree search engine page 350 of 554 np3_dl_sec08_tree.fm.08 may 18, 2001 figure 8-14. 60-bit mac hash function b0 a0 k0 (bits 191..160) d0 c0 b4 a4 k4 (bits 63..32) d4 c4 b5 a5 k5 (bits 31..0) d5 c5 b1 a1 k1 (bits 159..128) d1 c1 b2 a2 k2 (bits 127..96) d2 c2 b3 a3 k3 (bits 95..64) d3 c3 ip hash box ip hash box ip hash box ip hash box ip hash box ip hash box a0 h0 (bits 191..160) d0 c0 b4 a4 h4 (bits 63..32) d4 c4 b5 a5 h5 (bits 31..0) d5 c5 b1 a1 h1 (bits 159..128) d1 c1 b2 a2 h2 (bits 127..96) d2 c2 b3 a3 h3 (bits 95..64) d3 c3 b0 60-bit bitwise swap o(i)  i(59-i) 84 4
figure 8-15. 8-bit hash function
(diagram: an 8-bit hasher combined with the 192-bit ip hasher.)
note: the effective hashed key will be longer by 8 bits, and the maximum key length allowed is 184 bits with no color.
figure 8-16. 12-bit hash function
(diagram: a 12-bit hasher combined with the 192-bit ip hasher; d5 = 00000000, c5 = xxxx0000.)
note: the effective hashed key will be longer by 12 bits, and the maximum key length allowed is 180 bits with no color.
figure 8-17. 16-bit hash function
(diagram: a 16-bit hasher combined with the 192-bit ip hasher; d5 = 00000000, c5 = 00000000.)
note: the effective hashed key will be longer by 16 bits, and the maximum key length allowed is 176 bits with no color.
9. serial / parallel manager interface

located within the embedded processor complex (epc), the serial / parallel manager (spm) interface is a serial interface for communication with external devices. the spm interface consists of a clock signal output, a bi-directional data signal, and an interrupt input. on this interface, the np4gs3 is the master and the external spm module is the only slave (ibm does not supply the external spm module). the spm interface loads picocode, allowing management of physical layer devices (phys) and access to card-based functions such as light-emitting diodes (leds). the spm interface supports:

- an external spm module
- boot code load via external spm and eeprom
- boot override via cabwatch interface or boot_picocode configuration device i/o
- access to external phys, leds, management, and card-based functions

9.1 spm interface components

figure 9-1 shows the functional blocks of the spm interface; the list following the figure describes them.

figure 9-1. spm interface block diagram
(diagram: the instruction memory, boot state machine, cab interface, and parallel to serial control blocks connect the cab to the external spm module.)

boot state machine - starts up after reset if configured to do so by an external i/o pin (boot_picocode set to 0). it selects one of two boot images (picocode loads) based on a configuration flag found in the eeprom and places the code into the instruction memory in the epc. once the code is loaded, the boot state machine causes an interrupt that starts up the guided frame handler (gfh) thread, which executes the loaded code.

cab interface - a memory-mapped interface to the np4gs3 that allows the protocol processors to access any external device, including an spm module, external phys, or card leds.

parallel to serial control - converts between the internal 32-bit read/write parallel interface and the 3-bit external bi-directional serial interface.
9.2 spm interface data flow

the spm interface is used initially to boot the np4gs3. following a reset, the spm interface reads an external eeprom and loads the eeprom's contents into the epc's instruction memory. when loading is complete, the spm interface raises a por interrupt that causes the guided frame handler (gfh) to start executing the boot code. the boot code initializes the network processor's internal structures and configures all the interfaces that the network processor requires to operate. when all boot processing is complete, the gfh activates the operational signal to indicate its availability to the control point function (cpf). the cpf then sends guided frames to further initialize and configure the network processor, preparing it for network operation.

the boot state machine supports two images of boot code in the external eeprom (see figure 9-2). the contents of byte x'0 0000' in the eeprom are examined during the read process to determine which image to load. when the most significant bit of this byte is '0', the current image resides at addresses x'0 0001' - x'0 4000'; otherwise, the current image resides at addresses x'1 0001' - x'1 4000'. the boot state machine loads the appropriate image and allows the other image area to be used for boot code updates.
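a minimal sketch of the image-selection rule described above; the function and constant names are ours, and the eeprom read routine is an assumed helper.

#include <stdint.h>

#define IMAGE_A_BASE 0x00001u   /* picocode image A: x'0 0001' - x'0 4000' */
#define IMAGE_B_BASE 0x10001u   /* picocode image B: x'1 0001' - x'1 4000' */

/* Assumed helper: read one byte from the boot EEPROM at the given address. */
extern uint8_t eeprom_read_byte(uint32_t addr);

/* Select the active boot image from the boot flag at EEPROM byte x'0 0000':
 * MSB = 0 selects the image at x'0 0001', otherwise the image at x'1 0001'. */
static uint32_t select_boot_image(void)
{
    uint8_t boot_flag = eeprom_read_byte(0x00000u);
    return (boot_flag & 0x80) ? IMAGE_B_BASE : IMAGE_A_BASE;
}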
the spm interface is also used during initialization and statistics gathering. it interfaces with the ethernet phys, card leds, and other card-level structures through an external spm interface module supplied by the customer. these external structures are mapped to the control access bus (cab) address space and are accessed from the picocode using the same methods used to access any internal data structure: the picocode issues reads or writes to the appropriate address, and the spm interface converts these reads and writes into serial communications with the external spm module. the external spm module re-converts these serial communications into register reads and writes to access the desired card device. through the spm interface, the picocode has indirect access to all card-level functions and can configure or gather statistics from these devices as if they were directly attached to the cab interface.

figure 9-2. epc boot image in external eeprom
(diagram: the eeprom holds two boot images with identical layouts. image a occupies x'0 0000' - x'0 ffff' and image b occupies x'1 0000' - x'1 ffff'. within each image: the boot flag at x'0 0000' / x'1 0000'; dasl picocode, 16 k, at x'0 0001' - x'0 4000' / x'1 0001' - x'1 4000'; boot code & post & boot gfh, 8 k, at x'0 4001' - x'0 6000' / x'1 4001' - x'1 6000'; post overflow & extended post, 16 k, at x'0 6001' - x'0 a000' / x'1 6001' - x'1 a000'; other (24 k) and unused space fill the remainder up to x'0 ffff' / x'1 ffff'.)
9.3 spm interface protocol

the spm interface operates synchronously with the 33-mhz clock signal output. data, address, and control information is transferred serially on a bidirectional data signal. transitions on this data signal occur at the rising edges of the clock signal. each data exchange is initiated by the np4gs3 and can be one to four bytes in length. figure 9-3 illustrates the bit timing of the spm interface.

for single-byte transfers, the exchange begins when the np4gs3 drives the data signal to '1' for one clock period. this 'select' indication is followed by a 1-bit write/read indication, a 4-bit burst length indication, and a 25-bit address value. for read and write transfers, the spm interface master waits for a response from the spm interface slave, and the slave communicates with the master using the following acknowledgments:

- ack: a '1' driven onto the data bus by the spm interface slave to indicate each successful byte operation, either read or write.
- not-ack: a '0' driven onto the data bus by the spm interface slave while the byte operation is in progress.

for write transfers (see figure 9-4), the address is followed immediately by eight bits of data. the np4gs3 then puts its driver in high-z mode. one clock period later, the slave drives the data signal to '0' (not-ack) until the byte write operation is complete. the slave then drives the data signal to '1' (ack). during the byte write operation, the np4gs3 samples the data input looking for a '1' (ack). immediately following the ack, the slave puts its driver in high-z mode. the transfer is concluded one clock period later.

figure 9-3. spm bit timing
(diagram: data transitions relative to the clock, with quarter-period (.25t) timing markers.)

figure 9-4. spm interface write protocol
(diagram: one-byte write and two-byte write (burst length = 0010) sequences: sel, r/w, bl3..bl0, a24..a0, d7..d0, followed by zero or more not-acks preceding one ack; a new sequence can begin after the ack. bl# = bit # of the burst length, a# = bit # of the address, ack = acknowledgment, sel = select.)
for read transfers (see figure 9-5), the address is followed by the np4gs3 putting its driver into high-z mode. one clock period later, the slave drives the data signal to '0' (not-ack) until the byte of read data is ready for transfer. the slave then drives the data signal to '1' (ack). during this time, the np4gs3 samples the data input looking for an ack. immediately following the ack, the slave drives the eight bits of data onto the data signal and then puts its driver in high-z mode. the transfer is concluded one clock period later.

the protocol for multiple-byte transfers is similar, except that each byte written is accompanied by a bus turnaround, zero or more not-acks, and an ack. read bursts are characterized by the slave retaining bus ownership until the last byte of the burst is transferred. each successive byte read is preceded by at least one '0' on the data bus followed by one '1' on the data bus (ack), and is immediately followed by the eight bits of data.

figure 9-5. spm interface read protocol
(diagram: one-byte read and two-byte read (burst length = 0010) sequences: sel, r/w, bl3..bl0, a24..a0, then one or more not-acks preceding one ack, followed by d7..d0; a new sequence can begin after the last data byte. bl# = bit # of the burst length, a# = bit # of the address, ack = acknowledgment, sel = select.)
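a minimal sketch of how the serial header described above could be assembled before being shifted out one bit per clock, msb first. this is illustrative only (the np4gs3 performs this serialization in hardware); the names are ours and the write/read flag polarity is an assumption.

#include <stdint.h>

/* Build the SPM transaction header bits in transmit order:
 * select ('1'), 1-bit write/read indication, 4-bit burst length,
 * 25-bit address. Returns the 31 header bits right-aligned in a
 * 32-bit word; bit 30 (the select bit) is transmitted first. */
static uint32_t spm_build_header(int is_write, uint8_t burst_len, uint32_t addr)
{
    uint32_t hdr = 0;
    hdr |= 1u << 30;                          /* select indication                 */
    hdr |= (is_write ? 1u : 0u) << 29;        /* write/read flag (polarity assumed) */
    hdr |= (uint32_t)(burst_len & 0xF) << 25; /* 4-bit burst length                */
    hdr |= addr & 0x01FFFFFFu;                /* 25-bit address                    */
    return hdr;
}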
9.4 spm cab address space

the spm interface enables the control access bus (cab) to access an external eeprom and other devices attached via an external spm module developed by the customer. the address space is divided into three areas:

- byte access
- word access
- eeprom access

9.4.1 byte access space

all elements accessed in byte access space are limited to a single byte in width.

access type: r/w
base addresses: x'2800 0000' through x'281f ffff' and x'2880 0000' through x'28ff ffff'

field name | bit(s) | description
byte data  | 31:24  | data at this location
reserved   | 23:0   | reserved

9.4.2 word access space

all elements accessed in word access space are a word in width.

access type: r/w
base addresses: x'2820 0000' through x'287f ffff'

field name | bit(s) | description
word data  | 31:0   | data at this location
9.4.3 eeprom access space

the spm interface and a customer-supplied external spm module can be combined to provide access to an attached eeprom. the eeprom access space contains locations for 16 m 1-byte elements. all write accesses are limited to a single byte, but read accesses may be in bursts of 1, 2, 3, or 4 bytes. the cab address is formed using the field definitions shown in table 9-1.

table 9-1. field definitions for cab addresses

bits  | description
31:27 | '00101'
26:25 | encoded burst length
      |   00  4-byte burst (read only)
      |   01  1-byte burst (read or write)
      |   10  2-byte burst (read only)
      |   11  3-byte burst (read only)
24    | '1'
23:0  | starting byte address for read or write action

9.4.3.1 eeprom single-byte access

addresses in this space are used for single-byte read or write access to the eeprom.

access type: r/w
base addresses: x'2b00 0000' through x'2bff ffff'

field name | bit(s) | description
byte data  | 31:24  | data at starting byte address
reserved   | 23:0   | reserved
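a minimal sketch of how a cab address for eeprom access could be formed from the fields in table 9-1; the enum and function names are ours.

#include <stdint.h>

/* Encoded burst length (cab address bits 26:25), per table 9-1. */
enum spm_burst {
    SPM_BURST_4 = 0x0,  /* 4-byte burst, read only  */
    SPM_BURST_1 = 0x1,  /* 1-byte burst, read/write */
    SPM_BURST_2 = 0x2,  /* 2-byte burst, read only  */
    SPM_BURST_3 = 0x3   /* 3-byte burst, read only  */
};

/* Form a cab address for eeprom access:
 * bits 31:27 = '00101', bits 26:25 = encoded burst length,
 * bit 24 = '1', bits 23:0 = starting byte address. */
static uint32_t spm_eeprom_cab_addr(enum spm_burst burst, uint32_t byte_addr)
{
    return (0x05u << 27)
         | ((uint32_t)burst << 25)
         | (1u << 24)
         | (byte_addr & 0x00FFFFFFu);
}

/* example: a 2-byte read burst starting at eeprom byte 0x000010 yields
 * cab address 0x2d000010, which falls in the 2-byte access range. */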
9.4.3.2 eeprom 2-byte access

addresses in this space are used for a 2-byte read burst access to the eeprom.

access type: read only
base addresses: x'2d00 0000' through x'2dff ffff'

field name  | bit(s) | description
byte 0 data | 31:24  | data at starting byte address
byte 1 data | 23:16  | data at starting byte address + 1
reserved    | 15:0   | reserved

9.4.3.3 eeprom 3-byte access

addresses in this space are used for a 3-byte read burst access to the eeprom.

access type: read only
base addresses: x'2f00 0000' through x'2fff ffff'

field name  | bit(s) | description
byte 0 data | 31:24  | data at starting byte address
byte 1 data | 23:16  | data at starting byte address + 1
byte 2 data | 15:8   | data at starting byte address + 2
reserved    | 7:0    | reserved
9.4.3.4 eeprom 4-byte access

addresses in this space are used for a 4-byte read burst access to the eeprom.

access type: read only
base addresses: x'2900 0000' through x'29ff ffff'

field name  | bit(s) | description
byte 0 data | 31:24  | data at starting byte address
byte 1 data | 23:16  | data at starting byte address + 1
byte 2 data | 15:8   | data at starting byte address + 2
byte 3 data | 7:0    | data at starting byte address + 3
10. embedded powerpc subsystem

10.1 description

the ibm powernp np4gs3 incorporates an embedded powerpc subsystem. this subsystem consists of a mixture of macros from ibm's powerpc macro library and other components that were designed specifically for the np4gs3. standard ibm powerpc macros include the following:

- 133 mhz ppc405 processor core with 16 k of instruction cache and 16 k of data cache
- 133 mhz, 64-bit plb macro with plb arbiter
- 33/66 mhz, 32-bit pci to 133 mhz, 64-bit plb macro
- powerpc universal interrupt controller (uic) macro

documentation for the above macros is contained in the ibm powerpc 405gp embedded processor user's manual ( http://www-3.ibm.com/chips/techlib/techlib.nsf/products_powerpc_405gp_embedded_processor ) and is not repeated here. that document also describes other macro library components that are not included in the np4gs3; that information does not apply to the macros listed above.

the embedded powerpc subsystem includes a cab interface plb slave unit for access to np4gs3 internal structures, and a mailbox and dram interface plb slave unit for inter-processor communications and access to powerpc instructions.

figure 10-1. powerpc subsystem block diagram
(diagram: the powerpc 405 core and the pci/plb macro attach to the 64-bit plb and its plb arbiter; the cab interface (with cab arbiter) and the mailbox & dram interface (with dram controller, dram arbiter, and dram i/f macro) are plb slaves; the universal interrupt controller collects interrupts; reset control, the epc gph, the pci bus, and dram connect externally.)
10.2 processor local bus and device control register buses

the on-chip bus structure, consisting of the processor local bus (plb) and the device control register (dcr) bus, provides a link between the processor core and the other peripherals (plb master and slave devices) used in the powerpc subsystem design.

the plb is the high-performance bus used to access memory, pci devices, and np4gs3 structures through the plb interface units. the plb interface units shown in figure 10-1, the cab interface and the mailbox & dram interface, are plb slaves. the processor core has two plb master connections, one for the instruction cache and one for the data cache. the pci to plb interface unit, which is both a plb master and a plb slave device, is also attached to the plb; its plb master corresponds to the pci target and its plb slave corresponds to the pci master.

each plb master is attached to the plb through separate address, read data, and write data buses and a plurality of transfer qualifier signals. plb slaves are attached to the plb through shared, but decoupled, address, read data, and write data buses and a plurality of transfer control and status signals for each data bus. access to the plb is granted through a central arbitration mechanism that allows masters to compete for bus ownership. this mechanism is flexible enough to provide for the implementation of various priority schemes. additionally, an arbitration locking mechanism is provided to support master-driven atomic operations. the plb is a fully synchronous bus; timing for all plb signals is provided by a single clock source that is shared by all masters and slaves attached to the plb.

all plb arbiter registers are device control registers. they are accessed using the move from device control register (mfdcr) and move to device control register (mtdcr) instructions. plb arbiter registers are architected as 32 bits and are privileged for both read and write. the dcr base address of the plb registers is x'080'. the dcr base address of the uic registers is x'0c0'. details regarding the plb and uic device control registers can be found in the ibm powerpc 405gp embedded processor user's manual ( http://www-3.ibm.com/chips/techlib/techlib.nsf/products_powerpc_405gp_embedded_processor ).

the device control register (dcr) bus is used primarily to access status and control registers within the plb and the universal interrupt controller (uic). the dcr bus architecture allows data transfers among peripherals to occur independently from, and concurrently with, data transfers between the processor and memory or among other plb devices.

table 10-1. plb master connections

master id | master unit description
0         | processor core instruction cache unit
1         | processor core data cache unit
2         | plb/pci macro unit
others    | unused
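a minimal sketch, assuming a gcc-style compiler targeting the ppc405, of how the mfdcr/mtdcr accesses mentioned above could be wrapped; the macro names are ours, and the values shown are only the documented dcr base addresses (individual register offsets are defined in the ppc405gp user's manual).

#include <stdint.h>

/* DCR base addresses given in the text. */
#define DCR_PLB_BASE 0x080
#define DCR_UIC_BASE 0x0C0

/* mfdcr/mtdcr require the DCR number as an immediate, so these are
 * macros rather than functions. Illustrative sketch only. */
#define mfdcr(dcrn)                                                   \
    ({ uint32_t __v;                                                  \
       __asm__ __volatile__("mfdcr %0,%1" : "=r"(__v) : "i"(dcrn));   \
       __v; })

#define mtdcr(dcrn, val)                                              \
    __asm__ __volatile__("mtdcr %0,%1" : : "i"(dcrn), "r"(val))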
10.3 universal interrupt controller (uic)

the universal interrupt controller (uic) provides all the necessary control, status, and communication between the various interrupt sources and the microprocessor core. the uic supports six on-chip and two external interrupt sources. status reporting (using the uic status register (uicsr)) is provided so that system software can determine the current and interrupting state of the system and respond appropriately. software can also generate interrupts to simplify software development and for diagnostics. the interrupts can be programmed, using the uic critical register (uiccr), to generate either a critical or a non-critical interrupt signal.

the uic supports internal and external interrupt sources as defined in table 10-2. the on-chip interrupts (interrupts 0 and 3-7) and the external interrupts (interrupts 1-2) are programmable; however, the on-chip interrupts must be programmed as shown in table 10-2. for details regarding the control of the uic, including the programming of interrupts, see the ibm powerpc 405gp embedded processor user's manual ( http://www-3.ibm.com/chips/techlib/techlib.nsf/products_powerpc_405gp_embedded_processor ).

table 10-2. uic interrupt assignments

interrupt | polarity     | sensitivity  | interrupt source
0         | high         | edge         | dram d6 parity error
1         | programmable | programmable | external interrupt 0 (pci_bus_nm_int input pin)
2         | programmable | programmable | external interrupt 1 (pci_bus_m_int input pin)
3         | high         | level        | pci host to powerpc doorbell interrupt
4         | high         | level        | pci host to powerpc message interrupt
5         | high         | level        | embedded processor complex to powerpc doorbell interrupt
6         | high         | level        | embedded processor complex to powerpc message interrupt
7         | high         | level        | pci command write interrupt, generated when an external pci master writes to the pci command register or bit 13 of the bridge options 2 register is set to '1' via pci configuration. see the description of the pci macro's bridge options 2 register in the ppc405gp embedded controller user's manual for details.
8-31      | high         | level        | unused; interrupt input to uic tied low.
10.4 pci/plb macro

the peripheral component interconnect (pci) interface controller provides an interface for connecting plb-compliant devices to pci devices. the controller complies with the pci specification, version 2.2 ( http://www.pcisig.com ).

values of the pci device configuration header for the np4gs3 are initialized by hardware. these values are shown in table 10-3. when the boot picocode is loaded from the management bus (boot_picocode is tied to '0') and the powerpc subsystem boots from dram d6 (boot_ppc is tied to '0', see 2.1.8 miscellaneous pins on page 77), a general reset initializes the pci/plb macro's bridge options 2 register (pcibrdgopt2) with its host configuration enable bit set to '0' (disabled). this allows the powerpc subsystem to alter the contents of the pci device configuration header registers before they are accessed by external configuration. the powerpc code then enables external host configuration.

the pci/plb macro responds as a target on the plb bus in several address ranges. these ranges allow a plb master to configure the pci/plb macro and to cause the pci/plb macro to generate memory, i/o, configuration, interrupt acknowledge, and special cycles on the pci bus. table 10-4 shows the address map from the view of the plb, that is, as decoded by the pci/plb macro as a plb slave.

table 10-3. np4gs3 pci device configuration header values

register name       | register value
vendor id           | x'1014'
device id           | x'01e8'
revision id         | x'00'
class code          | x'028000'
subsystem id        | x'0000'
subsystem vendor id | x'0000'

table 10-4. plb address map for pci/plb macro (page 1 of 2)

plb address range         | description                                                                                                    | pci address range
x'e800 0000'-x'e800 ffff' | pci i/o: accesses to this range are translated to an i/o access on pci in the range 0 to 64 kb - 1            | x'0000 0000'-x'0000 ffff'
x'e801 0000'-x'e87f ffff' | reserved: pci/plb macro does not respond (other bridges use this space for non-contiguous i/o)                |
x'e880 0000'-x'ebff ffff' | pci i/o: accesses to this range are translated to an i/o access on pci in the range 8 mb to 64 mb - 1         | x'0080 0000'-x'03ff ffff'
x'ec00 0000'-x'eebf ffff' | reserved: pci macro does not respond                                                                          |
x'eec0 0000'-x'eecf ffff' | pcicfgadr and pcicfgdata: x'eec0 0000': config_address; x'eec0 0004': config_data; x'eec0 0008'-x'eecf ffff': reserved (can mirror pcicfgadr and pcicfgdata) |
table 10-4. plb address map for pci/plb macro (page 2 of 2)

plb address range         | description                                                                                                    | pci address range
x'eed0 0000'-x'eedf ffff' | pci interrupt acknowledge and special cycle: x'eed0 0000' read: interrupt acknowledge; x'eed0 0000' write: special cycle; x'eed0 0004'-x'eedf ffff': reserved (can mirror interrupt acknowledge and special cycle) |
x'eee0 0000'-x'ef3f ffff' | reserved: pci/plb macro does not respond                                                                      |
x'ef40 0000'-x'ef4f ffff' | pci/plb macro local configuration registers: x'ef40 0000': pmm0la; x'ef40 0004': pmm0ma; x'ef40 0008': pmm0pcila; x'ef40 000c': pmm0pciha; x'ef40 0010': pmm1la; x'ef40 0014': pmm1ma; x'ef40 0018': pmm1pcila; x'ef40 001c': pmm1pciha; x'ef40 0020': pmm2la; x'ef40 0024': pmm2ma; x'ef40 0028': pmm2pcila; x'ef40 002c': pmm2pciha; x'ef40 0030': ptm1ms; x'ef40 0034': ptm1la; x'ef40 0038': ptm2ms; x'ef40 003c': ptm2la; x'ef40 0040'-x'ef4f ffff': reserved (can mirror the pci local registers) |
x'0000 0000'-x'ffff ffff' | pci memory - range 0: pmm 0 registers map a region in plb space to a region in pci memory space. the address ranges are fully programmable. the pci address is 64 bits. | x'0000 0000 0000 0000'-x'ffff ffff ffff ffff'
x'0000 0000'-x'ffff ffff' | pci memory - range 1: pmm 1 registers map a region in plb space to a region in pci memory space. the address ranges are fully programmable. the pci address is 64 bits. | x'0000 0000 0000 0000'-x'ffff ffff ffff ffff'
x'0000 0000'-x'ffff ffff' | pci memory - range 2: pmm 2 registers map a region in plb space to a region in pci memory space. the address ranges are fully programmable. the pci address is 64 bits. | x'0000 0000 0000 0000'-x'ffff ffff ffff ffff'

following a general reset of the np4gs3, pci target map 1 is enabled for a pci address range of 128 kb and is mapped to the plb address range x'7800 0000' to x'7801 ffff'. the corresponding pci base address for this range must be set by pci configuration of the pci ptm1 base address register. likewise, pci target map 2 is enabled for a pci address range of 128 mb and is mapped to the plb address range x'0000 0000' to x'07ff ffff'. the corresponding pci base address for this range must be set by pci configuration of the pci ptm2 base address register.

the pci/plb macro has a mode that enables a plb master to access a pci memory range without initial configuration cycles. this mode is enabled by strapping the boot_ppc input pin high. system designers may use this mode, for instance, to allow a processor to access a boot rom in pci memory space. in this mode the pci/plb macro comes out of reset with pmm0 enabled and programmed for the address range x'fffe 0000'-x'ffff ffff'. the me field of the pci command register (pcicmd[me]) is also set to '1' after reset. enabling pci boot mode does not prevent subsequent updates to the pmm0 registers.
unless the boot picocode is loaded from the spm (boot_picocode tied to '0') and the powerpc subsystem boots from dram d6 (boot_ppc tied to '0'), a general reset initializes the pci/plb macro's bridge options 2 register (pcibrdgopt2) with its host configuration enable bit set to '1' (enabled). this allows an external source to access the pci/plb macro's configuration registers. otherwise, the powerpc code must enable external host configuration itself, and it may alter the contents of the pci device configuration header registers before enabling external host configuration.

for further details regarding the pci/plb macro's control and configuration registers, see the ibm powerpc 405gp embedded processor user's manual ( http://www-3.ibm.com/chips/techlib/techlib.nsf/products_powerpc_405gp_embedded_processor ).
10.5 plb address map

components of the embedded powerpc subsystem are connected using the processor local bus (plb). these components recognize fixed plb address values as their own; the plb address map describes the association between plb address values and the components that recognize them.

table 10-5. plb address map (page 1 of 2)

symbolic address   | plb address  | description                                                           | access
cab interface macro
pwrpc_cab_addr     | x'7800 0000' | powerpc cab address register                                          | r/w
pwrpc_cab_data     | x'7800 0008' | powerpc cab data register                                             | r/w
pwrpc_cab_cntl     | x'7800 0010' | powerpc cab control register                                          | r/w
pwrpc_cab_status   | x'7800 0018' | powerpc cab status register                                           | r
pwrpc_cab_mask     | x'7800 0020' | powerpc cab mask register [np4gs3b (r2.0) only]                       | r/w
pwrpc_cab_wum_data | x'7800 0028' | powerpc cab write under mask data register [np4gs3b (r2.0) only]      | w
host_cab_addr      | x'7800 8000' | pci host cab address register                                         | r/w
host_cab_data      | x'7800 8008' | pci host cab data register                                            | r/w
host_cab_cntl      | x'7800 8010' | pci host cab control register                                         | r/w
host_cab_status    | x'7800 8018' | pci host cab status register                                          | r
host_cab_mask      | x'7800 8020' | pci host cab mask register [np4gs3b (r2.0) only]                      | r/w
host_cab_wum_data  | x'7800 8028' | pci host cab write under mask data register [np4gs3b (r2.0) only]     | w
unassigned addresses in the range x'7800 0000'-x'7800 ffff' are reserved
mailbox and dram interface macro
pci_interr_status  | x'7801 0000' | pci interrupt status register                                         | r
pci_interr_ena     | x'7801 0008' | pci interrupt enable register                                         | r/w
p2h_msg_resource   | x'7801 0010' | powerpc subsystem to pci host resource register                       | r/w (1)
p2h_msg_addr       | x'7801 0018' | powerpc subsystem to pci host message address register                | r/w
p2h_doorbell       | x'7801 0020' | powerpc subsystem to pci host doorbell register (powerpc access)      | r/sum
                   | x'7801 0028' | powerpc subsystem to pci host doorbell register (pci host access)     | r/rum
h2p_msg_addr       | x'7801 0050' | pci host to powerpc subsystem message address register (reset status) | r (1)
                   | x'7801 0060' | pci host to powerpc subsystem message address register                | r/w
h2p_doorbell       | x'7801 0030' | pci host to powerpc subsystem doorbell register (pci host access)     | r/sum
                   | x'7801 0038' | pci host to powerpc doorbell register (powerpc access)                | r/rum
e2p_msg_resource   | x'7801 0040' | epc to powerpc subsystem message resource register                    | r/w
e2p_msg_addr       | x'7801 0048' | epc to powerpc subsystem message address register                     | r (1)
e2p_doorbell       | x'7801 0058' | epc to powerpc subsystem doorbell register (powerpc access)           | r/rum
p2e_msg_addr       | x'7801 0068' | powerpc subsystem to epc message address register                     | r/w
unassigned addresses in the range x'7801 0000'-x'7801 ffff' are reserved
1. additional action occurs on register access using the specified address. refer to the detailed register section for more information.
table 10-5. plb address map (page 2 of 2)

symbolic address   | plb address                 | description                                                 | access
p2e_doorbell       | x'7801 0070'                | powerpc subsystem to epc doorbell register (powerpc access) | r/sum
e2h_msg_resource   | x'7801 0080'                | epc to pci host message resource register                   | r/w
e2h_msg_addr       | x'7801 0088'                | epc to pci host message address register                    | r
e2h_doorbell       | x'7801 0098'                | epc to pci host doorbell register (pci host access)         | r/rum
h2e_msg_addr       | x'7801 00a8'                | pci host to epc message address register                    | r/w
msg_status         | x'7801 00a0'                | message status register                                     | r
h2e_doorbell       | x'7801 00b0'                | pci host to epc doorbell register (pci host access)         | r/sum
sear               | x'7801 00b8'                | slave error address register                                | r
sesr               | x'7801 00c0'                | slave error status register                                 | rwr
perr_cntr          | x'7801 00c8'                | parity error counter register                               | r
pwrpc_inst_store   | x'0000 0000' - x'07ff ffff' | powerpc instruction dram                                    | r/w
unassigned addresses in the range x'7801 0000'-x'7801 ffff' are reserved
1. additional action occurs on register access using the specified address. refer to the detailed register section for more information.

10.6 cab address map

some components of the embedded powerpc subsystem are also accessible via the np4gs3's cab interface. these components are accessed using cab addresses as shown in the cab address map.

table 10-6. cab address map (page 1 of 2)

symbolic address   | cab address                 | description                                                                                 | access
mailbox and dram interface macro
boot_redir_inst    | x'3800 0110' - x'3800 0117' | boot redirection instruction registers for instruction addresses x'ffff ffe0'-x'ffff fffc' | r/w
pwrpc_mach_chk     | x'3800 0210'                | powerpc subsystem machine check register                                                    | r
e2p_msg_resource   | x'3801 0010'                | epc to powerpc subsystem message resource register                                          | r (1)
e2p_msg_addr       | x'3801 0020'                | epc to powerpc subsystem message address register                                           | r/w
e2p_doorbell       | x'3801 0040'                | epc to powerpc doorbell register (powerpc access)                                           | r/sum
p2e_msg_addr       | x'3801 0080'                | powerpc subsystem to epc message address register                                           | r
                   | x'3802 0010'                | powerpc subsystem to epc message address register                                           | r (1)
p2e_doorbell       | x'3801 0100'                | powerpc subsystem to epc doorbell register (powerpc access)                                 | r/rum
e2h_msg_resource   | x'3801 0200'                | epc to pci host message resource register                                                   | r (1)
e2h_msg_addr       | x'3801 0400'                | epc to pci host message address register                                                    | r/w
e2h_doorbell       | x'3801 0800'                | epc to pci host doorbell register (pci host access)                                         | r/sum
1. additional action occurs on register access using the specified address. refer to the detailed register section for more information.
table 10-6. cab address map (page 2 of 2)

symbolic address | cab address  | description                                         | access
h2e_msg_addr     | x'3801 1000' | pci host to epc message address register            | r
                 | x'3802 0020' | pci host to epc message address register            | r (1)
h2e_doorbell     | x'3801 2000' | pci host to epc doorbell register (pci host access) | r/rum
msg_status       | x'3801 4000' | message status register                             | r
1. additional action occurs on register access using the specified address. refer to the detailed register section for more information.
10.7 cab interface macro

the cab interface macro provides duplicate facilities to support independent cab access to the network processor's control and status facilities by the pci host processor and the embedded powerpc subsystem. exclusive access to these facilities, if required, is enforced through software discipline.

the pci host processor can access the network processor's cab interface through the following mechanism: after pci configuration, one or more ranges of pci addresses are mapped to plb addresses. accessing these pci addresses also accesses powerpc plb resources, which include the following cab interface registers:

- cab address register: set to the value of the cab address to be read or written.
- cab control register: written with parameters that control the behavior of the cab access.
- cab data register: when accessed, initiates a cab access and determines its type (read or write). if the cab data register is written, the cab access is a write access; likewise, reading the cab data register results in a cab read access.
- cab status register: read to determine, for polled access, whether read data is ready (rd_rdy).

the cab control register (w/p) controls the two modes of plb protocol for cab access:

- wait access, w/p = '1', causes the cab interface macro to insert wait states on the plb until the cab access is complete. software need not read the cab status register to determine completion.
- polled access, w/p = '0', requires software to read the cab status register to synchronize software with the hardware when performing a cab transaction. cab access in polled mode requires software to follow the protocol defined in figure 10-2: polled access flow diagram on page 375; behavior for cab accesses that do not follow this protocol is undefined and may result in adverse effects. a cab read transaction requires at least two read accesses to the cab_data register, and an additional read access of the cab_data register may be required if the software is not synchronized with the hardware. software synchronization is determined initially by reading the cab_status register; the software is not synchronized when the rd_rdy status bit is set to '1'. synchronization is achieved by reading the cab_data register and discarding the result. a subsequent read of the cab_data register initiates a cab read access from the cab address contained in the cab address register; the data returned by this read of the cab_data register is stale and is discarded by software. software must then perform a final read of the cab_data register to acquire the data accessed on the cab and return the cab interface to its starting state (rd_rdy = '0').

np4gs3a (r1.1) requires reading of the cab_status register as a part of the cab access protocol. np4gs3b (r2.0) causes subsequent accesses to the cab interface registers to be retried until a pending cab access is complete. prolonged locking of the cab by the epc will result in extended retries by the powerpc subsystem or pci host; under these conditions, system hardware and software design must be able to tolerate long periods of retry.

figure 10-2. polled access flow diagram
(flow summary: start by reading cab_status; if rd_rdy = '1', read cab_data and discard the result to achieve hardware/software synchronization. for a cab write, write cab_data with the data value; for a cab read, read cab_data and discard the stale result, then read cab_data again to obtain the data (this also resets rd_rdy to '0'). on np4gs3a (r1.1), software polls cab_status (rd_rdy for pending reads, busy for pending writes) until the cab access completes; on np4gs3b (r2.0), cab interface accesses are retried by hardware while a cab access is pending.)
10.7.1 powerpc cab address (pwrpc_cab_addr) register

the powerpc cab address register is accessible from the plb and supplies a cab address value for powerpc subsystem access to np4gs3 structures via the cab interface.

access type: read/write
base address (plb): x'7800 0000'

field name     | bit(s) | reset        | description
pwrpc_cab_addr | 0:31   | x'0000 0000' | cab address value for powerpc subsystem access to np4gs3 structures via the cab interface.

10.7.2 powerpc cab data (pwrpc_cab_data) register

the powerpc cab data register is accessible from the plb and contains the value of cab data written or read when the powerpc subsystem accesses np4gs3 structures via the cab interface. writing to the pwrpc_cab_data register has the side effect of initiating a write access on the cab: the data value written to the pwrpc_cab_data register is written to the cab address contained in the pwrpc_cab_addr register.

when the pwrpc_cab_cntl register has been configured with its w/p bit set to '1', or if its w/p bit is set to '0' and the rd_rdy bit of the pwrpc_cab_status register is set to '0', a read of the pwrpc_cab_data register has the side effect of initiating a corresponding read access on the cab. at the end of the cab read access, the data value indicated by the pwrpc_cab_addr register is stored in the pwrpc_cab_data register and the rd_rdy bit of the pwrpc_cab_status register is set to '1'. when the w/p bit is set to '1', the data value is also returned to the plb; otherwise, a subsequent read access of the pwrpc_cab_data register is required to retrieve the data value.

access type: read/write
base address (plb): x'7800 0008'

field name     | bit(s) | reset        | description
pwrpc_cab_data | 0:31   | x'0000 0000' | cab data written or read when the powerpc subsystem accesses np4gs3 structures via the cab interface.
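a minimal sketch of a polled-mode (w/p = '0') cab read using the registers described above, following the r1.1 polling behavior of figure 10-2. the pointer and mask names are ours, and the bit masks assume the usual mapping of the big-endian bit numbering (bit 31 = lsb) onto a 32-bit value.

#include <stdint.h>

/* Assumed: the PowerPC subsystem reaches these registers at their PLB
 * addresses from table 10-5. */
#define PWRPC_CAB_ADDR   ((volatile uint32_t *)0x78000000u)
#define PWRPC_CAB_DATA   ((volatile uint32_t *)0x78000008u)
#define PWRPC_CAB_CNTL   ((volatile uint32_t *)0x78000010u)
#define PWRPC_CAB_STATUS ((volatile uint32_t *)0x78000018u)

#define CAB_STATUS_RD_RDY 0x2u   /* bit 30 in 0:31 numbering */

static uint32_t cab_read_polled(uint32_t cab_addr)
{
    *PWRPC_CAB_CNTL = 0;                 /* select polled access (w/p = '0') */

    /* synchronize: if stale read data is waiting, read and discard it */
    if (*PWRPC_CAB_STATUS & CAB_STATUS_RD_RDY)
        (void)*PWRPC_CAB_DATA;

    *PWRPC_CAB_ADDR = cab_addr;          /* target cab address               */
    (void)*PWRPC_CAB_DATA;               /* initiates the cab read; stale    */

    while (!(*PWRPC_CAB_STATUS & CAB_STATUS_RD_RDY))
        ;                                /* r1.1: poll until data is ready   */

    return *PWRPC_CAB_DATA;              /* actual data; resets rd_rdy       */
}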
10.7.3 powerpc cab control (pwrpc_cab_cntl) register

the powerpc cab control register is accessible from the plb and controls the cab access protocol used by the powerpc subsystem. the bit in this register indicates whether the wait or polled protocol is used when the powerpc subsystem accesses cab-connected structures within the np4gs3.

access type: read/write
base address (plb): x'7800 0010'

field name | bit(s) | reset | description
reserved   | 0:30   |       | reserved
w/p        | 31     | 0     | wait or polled access control value
           |        |       | 0  powerpc subsystem polls the pwrpc_cab_status register to determine when the access is complete
           |        |       | 1  cab interface macro inserts wait states on the plb until the cab access is complete
10.7.4 powerpc cab status (pwrpc_cab_status) register

the powerpc cab status register is accessible from the plb and monitors the status of powerpc cab accesses. bits within this register indicate the status of powerpc subsystem accesses to cab-connected structures within the np4gs3.

access type: read
base address (plb): x'7800 0018'

np4gs3a (r1.1):

field name | bit(s) | description
reserved   | 0:29   | reserved
rd_rdy     | 30     | read data ready indicator (used for polled access mode, w/p = '0', only)
           |        | 0  no data in the cab data register; a new cab access can begin
           |        | 1  data from a cab read access is waiting in the cab data register
busy       | 31     | busy indicator value
           |        | 0  cab access is not pending; set when a cab access is complete
           |        | 1  cab access is pending; set when a cab access (read or write) is initiated

np4gs3b (r2.0):

field name | bit(s) | description
reserved   | 0:29   | reserved
rd_rdy     | 30     | read data ready indicator (used for polled access mode, w/p = '0', only)
           |        | 0  no data in the cab data register; a new cab access can begin
           |        | 1  data from a cab read access is waiting in the cab data register
reserved   | 31     | reserved
10.7.5 powerpc cab mask (pwrpc_cab_mask) register [np4gs3b (r2.0) only]

the powerpc cab mask (pwrpc_cab_mask) register is accessible from the plb and supplies a mask value used in conjunction with a write-under-mask cab access. each '1' bit of the mask indicates which bits of the cab register will be updated; the corresponding bit value of the pwrpc_cab_wum_data register is used to update the cab register. all other bits of the cab register remain unaltered.

access type: read/write
plb address: x'7800 0020'
hardware reset: x'0000 0000'

field name     | bit(s) | description
pwrpc_cab_mask | 0:31   | cab mask value for powerpc subsystem access to network processor structures via the cab interface. each '1' bit of the mask indicates which bits of the cab register will be updated.
10.7.6 powerpc cab write under mask data (pwrpc_cab_wum_data) register [np4gs3b (r2.0) only]

the powerpc cab write under mask data (pwrpc_cab_wum_data) register is accessible from the plb and contains the value of cab data written when the powerpc subsystem accesses network processor structures via the cab interface using the write-under-mask function. writing to the pwrpc_cab_wum_data register has the side effect of initiating a write-under-mask access on the cab. the data value written to the pwrpc_cab_wum_data register is combined with the contents of the pwrpc_cab_mask register and the contents of the cab register indicated by the pwrpc_cab_addr register: for each bit location in which the value of the pwrpc_cab_mask register is '1', the corresponding bit of the cab register is updated with the contents of the pwrpc_cab_wum_data register; all other bits of the cab register remain unaltered.

the bits of this register are shared with the pwrpc_cab_data register; writing to this register will alter the contents of the pwrpc_cab_data register.

access type: write
plb address: x'7800 0028'
hardware reset: x'0000 0000'

field name         | bit(s) | description
pwrpc_cab_wum_data | 0:31   | cab data to be combined with the pwrpc_cab_mask register when the powerpc subsystem accesses network processor structures via the cab interface in write-under-mask mode.
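the write-under-mask update described above reduces to the following bit-level merge; a minimal sketch of the semantics only, not of the register access itself, with a function name of our choosing.

#include <stdint.h>

/* Write-under-mask semantics: each '1' bit in mask takes the
 * corresponding bit from wum_data; all other bits of the target
 * cab register value remain unaltered. */
static uint32_t write_under_mask(uint32_t cab_reg, uint32_t mask, uint32_t wum_data)
{
    return (cab_reg & ~mask) | (wum_data & mask);
}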
10.7.7 pci host cab address (host_cab_addr) register

the pci host cab address register is accessible from the plb and supplies a cab address value for pci host access to np4gs3 structures via the cab interface.

access type: read/write
base address (plb): x'7800 8000'

field name    | bit(s) | reset        | description
host_cab_addr | 0:31   | x'0000 0000' | cab address value for pci host access to np4gs3 structures via the cab interface.

10.7.8 pci host cab data (host_cab_data) register

the pci host cab data register is accessible from the plb and contains the value of cab data written or read when the pci host accesses np4gs3 structures via the cab interface. writing to the host_cab_data register has the side effect of initiating a write access on the cab: the data value written to the host_cab_data register is written to the cab address contained in the host_cab_addr register.

when the host_cab_cntl register has been configured with its w/p bit set to '1', or if its w/p bit is set to '0' and the rd_rdy bit of the host_cab_status register is set to '0', a read of the host_cab_data register has the side effect of initiating a corresponding read access on the cab. at the end of the cab read access, the data value indicated by the host_cab_addr register is stored in the host_cab_data register and the rd_rdy bit of the host_cab_status register is set to '1'. when the w/p bit is set to '1', the data value is also returned to the plb; otherwise, a subsequent read access of the host_cab_data register is required to retrieve the data value.

access type: read/write
base address (plb): x'7800 8008'

field name    | bit(s) | reset        | description
host_cab_data | 0:31   | x'0000 0000' | cab data written or read when the pci host accesses np4gs3 structures via the cab interface.
10.7.9 pci host cab control (host_cab_cntl) register

the pci host cab control register is accessible from the plb and controls the cab access protocol used by the pci host. the bit in this register indicates whether the wait or polled protocol is used when the pci host accesses cab-connected structures within the np4gs3.

access type: read/write
base address (plb): x'7800 8010'

field name | bit(s) | reset | description
reserved   | 0:30   |       | reserved
w/p        | 31     | 0     | wait or polled access control value
           |        |       | 0  pci host polls the host_cab_status register to determine when the access is complete
           |        |       | 1  cab interface macro inserts wait states on the plb until the cab access is complete
10.7.10 pci host cab status (host_cab_status) register

the pci host cab status register is accessible from the plb and monitors the status of pci host cab accesses. bits within this register indicate the status of pci host accesses to cab-connected structures within the np4gs3.

access type: read
base address (plb): x'7800 8018'

np4gs3a (r1.1):

field name | bit(s) | description
reserved   | 0:29   | reserved
rd_rdy     | 30     | read data ready indicator (used for polled access mode, w/p = '0', only)
           |        | 0  no data in the cab data register; a new cab access can begin
           |        | 1  data from a cab read access is waiting in the cab data register
busy       | 31     | busy indicator value
           |        | 0  cab access is not pending; set when a cab access is complete
           |        | 1  cab access is pending; set when a cab access (read or write) is initiated

np4gs3b (r2.0):

field name | bit(s) | description
reserved   | 0:29   | reserved
rd_rdy     | 30     | read data ready indicator (used for polled access mode, w/p = '0', only)
           |        | 0  no data in the cab data register; a new cab access can begin
           |        | 1  data from a cab read access is waiting in the cab data register
reserved   | 31     | reserved
10.7.11 pci host cab mask (host_cab_mask) register [np4gs3b (r2.0) only]

the pci host cab mask (host_cab_mask) register is accessible from the plb and supplies a mask value used in conjunction with a write-under-mask cab access. each '1' bit of the mask indicates which bits of the cab register will be updated; the corresponding bit value of the host_cab_wum_data register is used to update the cab register. all other bits of the cab register remain unaltered.

access type: read/write
plb address: x'7800 8020'
hardware reset: x'0000 0000'

field name    | bit(s) | description
host_cab_mask | 0:31   | cab mask value for pci host access to network processor structures via the cab interface. each '1' bit of the mask indicates which bits of the cab register will be updated.

10.7.12 pci host cab write under mask data (host_cab_wum_data) register [np4gs3b (r2.0) only]

the pci host cab write under mask data (host_cab_wum_data) register is accessible from the plb and contains the value of cab data written when the pci host accesses network processor structures via the cab interface using the write-under-mask function. writing to the host_cab_wum_data register has the side effect of initiating a write-under-mask access on the cab. the data value written to the host_cab_wum_data register is combined with the contents of the host_cab_mask register and the contents of the cab register indicated by the host_cab_addr register: for each bit location in which the value of the host_cab_mask register is '1', the corresponding bit of the cab register is updated with the contents of the host_cab_wum_data register; all other bits of the cab register remain unaltered.

the bits of this register are shared with the host_cab_data register; writing to this register will alter the contents of the host_cab_data register.

access type: write
plb address: x'7800 8028'
hardware reset: x'0000 0000'

field name        | bit(s) | description
host_cab_wum_data | 0:31   | cab data to be combined with the host_cab_mask register when the pci host accesses network processor structures via the cab interface in write-under-mask mode.
having acquired a message buffer, the powerpc subsystem composes a message in the buffer and writes the buffer ? s starting address value into the p2h_msg_addr register. this write also sets the pci_interr_status register ? s p2h_msg bit. a pci interrupt is generated when the corresponding bit of the pci_interr_ena register is set. the pci host reads the p2h_msg_addr register to find and process the message. the read clears the interrupt condition. for communication from the pci host to the powerpc, messages are composed in buffers in the pci address space. each message is then signalled to the powerpc subsystem by writing its starting pci address value into the h2p_msg_addr register. writing this register sets the h2p_msg_interr to the
writing this register asserts the h2p_msg interrupt to the powerpc uic and sets the h2p_msg_busy bit of the msg_status register; an interrupt is generated when enabled by the uic. the powerpc subsystem reads the h2p_msg_addr register at one address location to find and process the message, and this read clears the interrupt condition. a subsequent read of the h2p_msg_addr register at another address location resets the h2p_msg_busy bit of the msg_status register; this second read signals the pci host that the powerpc subsystem has finished processing the data buffer, allowing it to be reused.

the p2h_msg_addr register is written by the powerpc subsystem and read by the pci host processor. whenever the powerpc subsystem writes the p2h_msg_addr register, a bit in the pci interrupt status register is set to '1', independent of the value written. the pci interrupt status register indicates the source of the pci interrupt. the pci interrupt status register bit is reset to '0' when the pci host processor reads the p2h_msg_addr register. software discipline controls the setting of the interrupt by the powerpc subsystem and the resetting of the interrupt by the pci host (only the powerpc subsystem writes this register and only the pci host reads this register).

the doorbell register is written and read by either the powerpc subsystem or the pci host processor. the value recorded in this register depends upon the data value to be written, the current contents of the register, and whether the powerpc subsystem or the pci host processor is writing the register. when written by the powerpc subsystem, each bit of the register's current contents is compared with the corresponding data bit to be written: if the value of the data bit is '1', the corresponding doorbell register bit is set to '1'; otherwise it remains unchanged ('0' or '1'). this effect is referred to as set-under-mask (sum). when written by the pci host processor, each bit of the register's current contents is compared with the corresponding data bit to be written: if the value of the data bit is '1', the corresponding doorbell register bit is reset to '0'; otherwise it remains unchanged ('0' or '1'). this effect is referred to as reset-under-mask (rum). if one or more of the bits in the doorbell register are '1', a signal is generated and stored in the pci interrupt status register. any of the signals recorded as '1' in the pci interrupt status register activates the inta# signal if the corresponding condition is enabled in the np4gs3's pci interrupt enable register. the pci host processor reads the pci interrupt status register to determine the interrupt source.
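the set-under-mask and reset-under-mask doorbell behavior described above reduces to the following bit operations; a minimal sketch of the semantics only, with function names of our choosing.

#include <stdint.h>

/* Set-under-mask (SUM): a write by the PowerPC subsystem sets every
 * doorbell bit whose corresponding write-data bit is '1'. */
static uint32_t doorbell_sum(uint32_t doorbell, uint32_t wdata)
{
    return doorbell | wdata;
}

/* Reset-under-mask (RUM): a write by the PCI host resets every
 * doorbell bit whose corresponding write-data bit is '1'. */
static uint32_t doorbell_rum(uint32_t doorbell, uint32_t wdata)
{
    return doorbell & ~wdata;
}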
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 387 of 554 10.8.2 pci interrupt status (pci_interr_status) register the pci interrupt status register is accessible from the plb and records the source of the pci interrupts generated by the np4gs3. this register ? s bits are set by hardware and are read by pci host software. when the interrupt source is cleared, the corresponding bit of the pci_interr_status register is also cleared. access type read only base address (plb) x ? 7801 0000 ? reserved pci_macro reserved e2h_db e2h_msg p2h_db p2h_msg 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description reserved 0:23 reserved pci_macro 24 0 pci interrupt from macro indicator. a pci interrupt is asserted by the plb to pci macro when bit 0 of the macro's pciics register is set by software to a value of '1'. see the ibm powerpc 405gp embedded processor user ? s manual for details. 0 interrupt absent 1 interrupt present reserved 25:27 reserved e2h_db 28 0 epc to pci host doorbell indicator. 0 pci interrupt from e2h_doorbell register absent 1 pci interrupt from e2h_doorbell register present e2h_msg 29 0 epc to pci host message indicator. 0 pci interrupt from e2h_message register absent 1 pci interrupt from e2h_message register present p2h_db 30 0 powerpc subsystem to pci host doorbell indicator. 0 pci interrupt from p2h_doorbell register absent 1 pci interrupt from p2h_doorbell register present p2h_msg 31 0 powerpc subsystem to pci host message indicator. 0 pci interrupt from p2h_message register absent 1 pci interrupt from p2h_message register present
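the bit assignments above translate into the following c masks when the register is read as a conventional 32-bit word with bit 0 as the least-significant bit (plb bit numbering counts from the most-significant end, so plb bit n corresponds to the mask 1 << (31 - n)). the macro names are illustrative; the positions come from the table. the same masks apply to the pci_interr_ena register in 10.8.3, since its layout matches.

```c
/* pci_interr_status / pci_interr_ena bit masks (plb x'7801 0000' / x'7801 0008') */
#define PCI_INT_P2H_MSG    (1u << 0)   /* plb bit 31: p2h message written      */
#define PCI_INT_P2H_DB     (1u << 1)   /* plb bit 30: p2h doorbell active      */
#define PCI_INT_E2H_MSG    (1u << 2)   /* plb bit 29: e2h message written      */
#define PCI_INT_E2H_DB     (1u << 3)   /* plb bit 28: e2h doorbell active      */
#define PCI_INT_PCI_MACRO  (1u << 7)   /* plb bit 24: interrupt from pci macro */
```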
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 388 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.3 pci interrupt enable (pci_interr_ena) register the pci interrupt enable register is accessible from the plb and enables pci interrupts from sources within the np4gs3. access type read/write base address (plb) x ? 7801 0008 ? reserved pci_macro reserved e2h_db e2h_msg p2h_db p2h_msg 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description reserved 0:23 reserved pci_macro 24 0 pci macro interrupt - controls the assertion of a pci interrupt from the pci macro. a pci interrupt is asserted by the plb to pci macro when bit 0 of the macro's pciics register is set by software to a value of '1'. see the ibm powerpc 405gp embedded processor user ? s manual for details. 0 interrupt disabled 1 interrupt enabled reserved 25:27 reserved e2h_db 28 0 epc to pci host doorbell interrupt - controls the assertion of a pci inter- rupt from the e2h_doorbell register. 0 interrupt disabled 1 interrupt enabled e2h_msg 29 0 epc to pci host message interrupt - controls the assertion of a pci inter- rupt from the e2h_message register. 0 interrupt disabled 1 interrupt enabled p2h_db 30 0 powerpc subsystem to pci host doorbell interrupt - controls the asser- tion of a pci interrupt from the p2h_doorbell register. 0 interrupt disabled 1 interrupt enabled p2h_msg 31 0 powerpc subsystem to pci host message interrupt - controls the asser- tion of a pci interrupt from the p2h_message register. 0 interrupt disabled 1 interrupt enabled
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 389 of 554 10.8.4 powerpc subsystem to pci host message resource (p2h_msg_resource) register the powerpc subsystem to pci host message resource register is accessible from the plb. the powerpc subsystem uses this register to obtain message buffers in pci address space for messages the powerpc subsystem sends to the pci host processor. the pci host writes the starting pci address value of a message buffer into the p2h_msg_resource register. writing to this register sets the p2h_bufr_valid flag found in the message status register (see 10.8.23 message status (msg_status) register on page 409) and reading the p2h_msg_resource register clears this flag. 10.8.5 powerpc subsystem to host message address (p2h_msg_addr) register the powerpc subsystem to pci host message address register is accessible from the plb and is used by the powerpc subsystem to send messages to the pci host processor. the value written into this register is the pci address at which the message begins. writing to this register sets the p2h_msg bit of the pci_interr_status register. when the corresponding bit of the pci_interr_ena register is set to ? 1 ? ,theinta# signal of the pci bus is activated. the p2h_msg bit of the pci_interr_status register is reset when the inter- rupt service routine reads the p2h_msg_addr register. access type read/write base address (plb) x ? 7801 0010 ? p2h_msg_resource 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description p2h_msg_resource 0:31 x ? 0000 0000 ? powerpc subsystem to pci host message resource value, written with the pci starting address of a message buffer. writing this register sets to 1 the p2h_bufr_valid flag found in the message status register (see 10.8.23 message status (msg_status) register on page 409). reading this register sets to 0 the p2h_bufr_valid flag. access type read/write base address (plb) x ? 7801 0018 ? p2h_msg_addr 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description p2h_msg_addr 0:31 x ? 0000 0000 ? powerpc subsystem to pci host message address value, indicates the pci starting address of a message.
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 390 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.6 powerpc subsystem to host doorbell (p2h_doorbell) register the powerpc subsystem to pci host doorbell register is accessible from the plb and is used by the powerpc subsystem to signal interrupts to the pci host processor. the powerpc subsystem has read and sum write access to this register. the data contains the mask used to access this register. when bits of this register are set to ? 1 ? and the corresponding bit of the pci_interr_ena register is set to ? 1 ? , the inta# signal of the pci bus is activated. the pci host processor reads this register to determine which of the doorbells have been activated. the pci host processor has read and rum write access to this register using a different plb address value. access type power pc x ? 7801 0020 ? host x ? 7801 0028 ? base address (plb) power pc read/set-under-mask host read/reset-under-mask p2h_msg_doorbell 31 p2h_msg_doorbell 30 p2h_msg_doorbell 29 p2h_msg_doorbell 28 p2h_msg_doorbell 27 p2h_msg_doorbell 26 p2h_msg_doorbell 25 p2h_msg_doorbell 24 p2h_msg_doorbell 23 p2h_msg_doorbell 22 p2h_msg_doorbell 21 p2h_msg_doorbell 20 p2h_msg_doorbell 19 p2h_msg_doorbell 18 p2h_msg_doorbell 17 p2h_msg_doorbell 16 p2h_msg_doorbell 15 p2h_msg_doorbell 14 p2h_msg_doorbell 13 p2h_msg_doorbell 12 p2h_msg_doorbell 11 p2h_msg_doorbell 10 p2h_msg_doorbell 9 p2h_msg_doorbell 8 p2h_msg_doorbell 7 p2h_msg_doorbell 6 p2h_msg_doorbell 5 p2h_msg_doorbell 4 p2h_msg_doorbell 3 p2h_msg_doorbell 2 p2h_msg_doorbell 1 p2h_msg_doorbell 0 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description p2h_msg_doorbell 31 0 0 powerpc subsystem to pci host doorbell - indicates which of the 32 pos- sible doorbells have been activated. 0 not activated 1activated p2h_msg_doorbell 30:1 1:30 0 for all p2h_msg_doorbell 0 31 0
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 391 of 554 10.8.7 host to powerpc subsystem message address (h2p_msg_addr) register the pci host to powerpc subsystem message address register is accessible from the plb and is used by the pci host to send messages to the powerpc subsystem ? s processor. the value written into this register is a message ? s pci starting address. writing to this register activates the h2p_msg_interr input to the powerpc uic. when this interrupt is enabled, an interrupt to the powerpc subsystem is generated. reading this register resets the h2p_msg_interr input to the uic. reading this register at the alternate plb address resets the message status register ? s h2p_msg_busy bit (see 10.8.23 message status (msg_status) register on page 409). access type alternate read (additional actions occur when reading this register using this address. please see above description for details.) primary read/write base address (plb) alternate x ? 7801 0050 ? primary x ? 7801 0060 ? h2p_msg_addr 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description h2p_msg_addr 0:31 x ? 0000 0000 ? the value is a message ? s pci starting address.
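the two plb read addresses of h2p_msg_addr implement the handshake described in 10.8.1: one read fetches the message pointer, and a later read at the alternate address releases the buffer by clearing h2p_msg_busy. the sketch below assumes that the primary-address read is the one that deasserts h2p_msg_interr; the accessor and handler names are hypothetical.

```c
#include <stdint.h>

#define H2P_MSG_ADDR_PRIMARY    0x78010060u  /* read fetches pointer, clears h2p_msg_interr (assumed) */
#define H2P_MSG_ADDR_ALTERNATE  0x78010050u  /* read clears h2p_msg_busy in msg_status */

extern uint32_t plb_read32(uint32_t addr);            /* hypothetical */
extern void     process_message(uint32_t pci_addr);   /* hypothetical */

/* uic interrupt handler for h2p_msg_interr on the powerpc subsystem */
void h2p_msg_isr(void)
{
    /* fetch the message pointer; this read deasserts the interrupt input */
    uint32_t msg_pci_addr = plb_read32(H2P_MSG_ADDR_PRIMARY);

    process_message(msg_pci_addr);

    /* the second read, at the alternate address, clears h2p_msg_busy,
     * telling the pci host that the buffer may be reused
     */
    (void)plb_read32(H2P_MSG_ADDR_ALTERNATE);
}
```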
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 392 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.8 host to powerpc subsystem doorbell (h2p_doorbell) register the pci host to powerpc subsystem doorbell (h2p_doorbell) register is accessible from the plb and is used by the pci host processor to signal interrupts to the powerpc subsystem. the pci host processor has read and sum write access to this register. the data contains the mask used to access this register. when any of this register ? sbitsaresetto ? 1 ? , an interrupt signal of the powerpc uic is activated. the powerpc subsystem reads this register to determine which of the doorbells have been activated. the powerpc subsystem has read and rum write access to this register using a different plb address value. access type host read/set-under-mask write powerpc read/reset-under-mask write base address (plb) host x ? 7801 0030 ? powerpc x ? 7801 0038 ? h2p_doorbells h2p_msg_doorbell 31 h2p_msg_doorbell 30 h2p_msg_doorbell 29 h2p_msg_doorbell 28 h2p_msg_doorbell 27 h2p_msg_doorbell 26 h2p_msg_doorbell 25 h2p_msg_doorbell 24 h2p_msg_doorbell 23 h2p_msg_doorbell 22 h2p_msg_doorbell 21 h2p_msg_doorbell 20 h2p_msg_doorbell 19 h2p_msg_doorbell 18 h2p_msg_doorbell 17 h2p_msg_doorbell 16 h2p_msg_doorbell 15 h2p_msg_doorbell 14 h2p_msg_doorbell 13 h2p_msg_doorbell 12 h2p_msg_doorbell 11 h2p_msg_doorbell 10 h2p_msg_doorbell 9 h2p_msg_doorbell 8 h2p_msg_doorbell 7 h2p_msg_doorbell 6 h2p_msg_doorbell 5 h2p_msg_doorbell 4 h2p_msg_doorbell 3 h2p_msg_doorbell 2 h2p_msg_doorbell 1 h2p_msg_doorbell 0 012345678910111213141516171819202122232425262728293031 field name plb bit(s) reset description h2p_msg_doorbell 31 0 0 pci host to powerpc subsystem doorbell - indicates which of the 32 pos- sible doorbells have been activated. 0 not activated 1activated h2p_msg_doorbell 30:1 1:30 0 for all h2p_msg_doorbell 0 31 0
10.8.9 mailbox communications between powerpc subsystem and epc

communication between the powerpc subsystem and the epc is accomplished by writing message data into buffers in the powerpc dram (d6) and signalling the destination processor with an interrupt. the powerpc software manages message data buffers for powerpc subsystem to epc (p2e) messages and epc to powerpc subsystem (e2p) messages.

message data buffers are allocated to the epc by writing a buffer's starting address into the e2p_msg_resource register. writing to this register sets the e2p_bufr_valid flag in the msg_status register. this flag indicates to the epc that the buffer is valid and can be used by the epc. the epc reads the e2p_bufr_valid value when a message buffer is required. the epc then reads the e2p_msg_resource register via the cab interface to obtain the address of the message buffer, and the e2p_bufr_valid indicator bit is reset. by polling this indicator bit, the powerpc subsystem's processor knows when to replenish the e2p_msg_resource register with a new buffer address value.

having acquired a message data buffer, the epc composes a message in the buffer and writes the buffer's starting address value into the e2p_msg_addr register. writing to this register generates an interrupt signal to the powerpc uic. the powerpc subsystem reads this register to find and process the message and, due to the read, the interrupt condition is cleared.

there is no need for a p2e_msg_resource register because the powerpc software manages the message data buffers. the powerpc subsystem composes a message in one of the data buffers and writes the starting address of the buffer into the p2e_msg_addr register. writing to this register produces an interrupt to the epc and sets the p2e_msg_busy bit of the msg_status register. as long as this flag is set, the epc is processing the message buffer. the epc reads the p2e_msg_addr register to locate the buffer in the powerpc dram. the epc resets the p2e_msg_busy bit by reading the p2e_msg_addr register at an alternate cab address when message processing is complete. the powerpc subsystem polls the busy flag to determine when the buffer can be reused.
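a minimal powerpc-side sketch of this exchange follows, assuming memory-mapped plb accessors and the register addresses given in the following subsections. buffer management, the message format, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define MBOX_BASE        0x78010000u   /* plb mailbox window */
#define E2P_MSG_RESOURCE 0x40          /* plb x'7801 0040' */
#define P2E_MSG_ADDR     0x68          /* plb x'7801 0068' */
#define MSG_STATUS       0xa0          /* plb x'7801 00a0' */
#define E2P_BUFR_VALID   (1u << 3)     /* plb bit 28 of msg_status */
#define P2E_MSG_BUSY     (1u << 2)     /* plb bit 29 of msg_status */

extern uint32_t plb_read32(uint32_t addr);              /* hypothetical */
extern void     plb_write32(uint32_t addr, uint32_t val);
extern void    *dram_addr_to_virt(uint32_t dram_addr);  /* hypothetical */

/* give the epc an empty buffer in powerpc dram for its next e2p message */
void e2p_replenish(uint32_t empty_buf_dram_addr)
{
    if ((plb_read32(MBOX_BASE + MSG_STATUS) & E2P_BUFR_VALID) == 0)
        plb_write32(MBOX_BASE + E2P_MSG_RESOURCE, empty_buf_dram_addr);
}

/* send a p2e message; the caller owns the buffer until p2e_msg_busy clears */
int p2e_send(uint32_t buf_dram_addr, const void *msg, uint32_t len)
{
    if (plb_read32(MBOX_BASE + MSG_STATUS) & P2E_MSG_BUSY)
        return -1;                       /* epc still processing the last message */

    memcpy(dram_addr_to_virt(buf_dram_addr), msg, len);
    plb_write32(MBOX_BASE + P2E_MSG_ADDR, buf_dram_addr);  /* interrupts the epc */
    return 0;
}
```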
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 394 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.10 epc to powerpc subsystem resource (e2p_msg_resource) register the powerpc subsystem accesses the epc to powerpc subsystem message resource register from the plb while the epc accesses this register from its cab interface. the epc uses this register to obtain message buffers in the powerpc dram address space for messages it sends to the powerpc processor. the powerpc processor writes the starting dram address value of a message buffer. writing to this register sets the e2p_bufr_valid flag in the msg_status register (see 10.8.23 message status (msg_status) register on page 409). reading the e2p_msg_resource register from the cab resets this flag. cab view plb view cab view plb view access type cab read (see above description for additional actions that occur during a read of this register using this address.) plb read/write (see above description for additional actions that occur during a read of this register using this address.) base address cab x ? 3801 0010 ? plb x ? 7801 0040 ? e2p_msg_resource 313029282726252423222120191817161514131211109876543210 e2p_msg_resource 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description e2p_msg_resource 31:0 x ? 0000 0000 ? epc to powerpc subsystem message resource - written with the pow- erpc dram starting address of a message buffer. field name bit(s) reset description e2p_msg_resource 0:31 x ? 0000 0000 ? epc to powerpc subsystem message resource - written with the pow- erpc dram starting address of a message buffer.
10.8.11 epc to powerpc subsystem message address (e2p_msg_addr) register

the powerpc subsystem accesses the epc to powerpc subsystem message address register from the plb, while the epc accesses this register from its cab interface. the epc uses this register to send messages to the powerpc processor. the value written into this register is the powerpc dram address at which the message begins. writing to this register sets the e2p_msg_interr input to the powerpc uic. when the uic is configured to enable this input, an interrupt signal to the powerpc processor is activated. reading the e2p_msg_addr register via the plb address resets the e2p_msg_interr input to the powerpc uic.

cab view
access type: read/write
base address: x'3801 0020'

field name | bit(s) | reset | description
e2p_msg_addr | 31:0 | x'0000 0000' | epc to powerpc subsystem message address - indicates the powerpc dram starting address of a message.

plb view
access type: read
base address: x'7801 0048'

field name | bit(s) | reset | description
e2p_msg_addr | 0:31 | x'0000 0000' | epc to powerpc subsystem message address - indicates the powerpc dram starting address of a message.
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 396 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.12 epc to powerpc subsystem doorbell (e2p_doorbell) register the powerpc subsystem accesses the epc to powerpc subsystem doorbell register from the plb while the epc accesses this register from the cab interface. the epc uses this register to signal interrupts to the powerpc subsystem. the epc has read and sum write access to this register using a cab address value. the data contains the mask used to access this register. when any of this register ? sbitsaresetto ? 1 ? an inter- rupt signal to the powerpc uic is activated. the powerpc subsystem reads this register to determine which of the doorbells have been activated.the powerpc subsystem has read and rum write access to this register using a plb address value. cab view plb view cab view access type cab read/set-under-mask write plb read/reset-under-mask write base address cab x ? 3801 0040 ? plb x ? 7801 0058 ? e2p_doorbells e2p_msg_doorbell 31 e2p_msg_doorbell 30 e2p_msg_doorbell 29 e2p_msg_doorbell 28 e2p_msg_doorbell 27 e2p_msg_doorbell 26 e2p_msg_doorbell 25 e2p_msg_doorbell 24 e2p_msg_doorbell 23 e2p_msg_doorbell 22 e2p_msg_doorbell 21 e2p_msg_doorbell 20 e2p_msg_doorbell 19 e2p_msg_doorbell 18 e2p_msg_doorbell 17 e2p_msg_doorbell 16 e2p_msg_doorbell 15 e2p_msg_doorbell 14 e2p_msg_doorbell 13 e2p_msg_doorbell 12 e2p_msg_doorbell 11 e2p_msg_doorbell 10 e2p_msg_doorbell 9 e2p_msg_doorbell 8 e2p_msg_doorbell 7 e2p_msg_doorbell 6 e2p_msg_doorbell 5 e2p_msg_doorbell 4 e2p_msg_doorbell 3 e2p_msg_doorbell 2 e2p_msg_doorbell 1 e2p_msg_doorbell 0 313029282726252423222120191817161514131211109876543210 e2p_doorbells e2p_msg_doorbell 31 e2p_msg_doorbell 30 e2p_msg_doorbell 29 e2p_msg_doorbell 28 e2p_msg_doorbell 27 e2p_msg_doorbell 26 e2p_msg_doorbell 25 e2p_msg_doorbell 24 e2p_msg_doorbell 23 e2p_msg_doorbell 22 e2p_msg_doorbell 21 e2p_msg_doorbell 20 e2p_msg_doorbell 19 e2p_msg_doorbell 18 e2p_msg_doorbell 17 e2p_msg_doorbell 16 e2p_msg_doorbell 15 e2p_msg_doorbell 14 e2p_msg_doorbell 13 e2p_msg_doorbell 12 e2p_msg_doorbell 11 e2p_msg_doorbell 10 e2p_msg_doorbell 9 e2p_msg_doorbell 8 e2p_msg_doorbell 7 e2p_msg_doorbell 6 e2p_msg_doorbell 5 e2p_msg_doorbell 4 e2p_msg_doorbell 3 e2p_msg_doorbell 2 e2p_msg_doorbell 1 e2p_msg_doorbell 0 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description e2p_msg_doorbell 31 31 0 epc to powerpc subsystem doorbell - indicates which of the 32 possible doorbells is activated. 0 not activated 1activated e2p_msg_doorbell 30:1 30:1 0 for all e2p_msg_doorbell 0 0 0
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 397 of 554 plb view field name bit(s) reset description e2p_msg_doorbell 31 0 0 epc to powerpc subsystem doorbell - indicates which of the 32 possible doorbells is activated. 0 not activated 1 activated e2p_msg_doorbell 30:1 1:30 0 for all e2p_msg_doorbell 0 31 0
10.8.13 epc interrupt vector register

the epc contains an interrupt vector 2 register which is accessible from the cab interface. this register records the source of epc interrupts generated by the np4gs3. this register's bits are set by hardware and are read by epc software to determine the source of interrupts.

10.8.14 epc interrupt mask register

the epc contains an interrupt mask 2 register which is accessible from the cab interface. this register enables epc interrupts from sources within the np4gs3.
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 399 of 554 10.8.15 powerpc subsystem to epc message address (p2e_msg_addr) register the powerpc subsystem accesses the powerpc subsystem to epc message address register from the plb, while the epc accesses this register from the cab interface. this register is used by the powerpc subsystem to send messages to the epc. the value written into this register is the powerpc dram address at which the message begins. writing to this register sets the p2e_msg_interr signal to the epc and the p2e_msg_busy bit of the msg_status register. reading the p2e_msg_addr register from the cab will reset the p2e_msg_interr signal to the epc. reading the p2e_msg_addr from an alternate cab address will reset the p2e_msg_busy bit of the msg_status register (see 10.8.23 message status (msg_status) register on page 409). cab view plb view cab view plb view access type primary read alternate read (additional actions occur when reading this register using this address. please see above description for details.) plb read/write base address primary x ? 3801 0080 ? alternate x ? 3802 0010 ? plb x ? 7801 0068 ? p2e_msg_addr 313029282726252423222120191817161514131211109876543210 p2e_msg_addr 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description p2e_msg_addr 31:0 x ? 0000 0000 ? powerpc subsystem to epc message address - indicates a message ? s powerpc dram starting address. field name bit(s) reset description p2e_msg_addr 0:31 x ? 0000 0000 ? powerpc subsystem to epc message address - indicates a message ? s powerpc dram starting address.
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 400 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.16 powerpc subsystem to epc doorbell (p2e_doorbell) register the powerpc subsystem accesses the powerpc subsystem to epc doorbell register from the plb, while the epc accesses this register from the cab interface. the powerpc subsystem uses this register to signal interrupts to the epc. the powerpc subsystem has read and sum write access to this register. the data contains the mask used to access this register. when any of this register ? sbitsaresetto ? 1 ? , an interrupt signal of the epc is activated. this register is read by the epc to determine which doorbells have been acti- vated. the epc has read and rum write access to this register using a cab address value. cab view plb view cab view access type cab read/reset-under-mask write plb read/set-under-mask write base address cab x ? 3801 0100 ? plb x ? 7801 0070 ? p2e_doorbells p2e_msg_doorbell 31 p2e_msg_doorbell 30 p2e_msg_doorbell 29 p2e_msg_doorbell 28 p2e_msg_doorbell 27 p2e_msg_doorbell 26 p2e_msg_doorbell 25 p2e_msg_doorbell 24 p2e_msg_doorbell 23 p2e_msg_doorbell 22 p2e_msg_doorbell 21 p2e_msg_doorbell 20 p2e_msg_doorbell 19 p2e_msg_doorbell 18 p2e_msg_doorbell 17 p2e_msg_doorbell 16 p2e_msg_doorbell 15 p2e_msg_doorbell 14 p2e_msg_doorbell 13 p2e_msg_doorbell 12 p2e_msg_doorbell 11 p2e_msg_doorbell 10 p2e_msg_doorbell 9 p2e_msg_doorbell 8 p2e_msg_doorbell 7 p2e_msg_doorbell 6 p2e_msg_doorbell 5 p2e_msg_doorbell 4 p2e_msg_doorbell 3 p2e_msg_doorbell 2 p2e_msg_doorbell 1 p2e_msg_doorbell 0 313029282726252423222120191817161514131211109876543210 p2e_doorbells p2e_msg_doorbell 31 p2e_msg_doorbell 30 p2e_msg_doorbell 29 p2e_msg_doorbell 28 p2e_msg_doorbell 27 p2e_msg_doorbell 26 p2e_msg_doorbell 25 p2e_msg_doorbell 24 p2e_msg_doorbell 23 p2e_msg_doorbell 22 p2e_msg_doorbell 21 p2e_msg_doorbell 20 p2e_msg_doorbell 19 p2e_msg_doorbell 18 p2e_msg_doorbell 17 p2e_msg_doorbell 16 p2e_msg_doorbell 15 p2e_msg_doorbell 14 p2e_msg_doorbell 13 p2e_msg_doorbell 12 p2e_msg_doorbell 11 p2e_msg_doorbell 10 p2e_msg_doorbell 9 p2e_msg_doorbell 8 p2e_msg_doorbell 7 p2e_msg_doorbell 6 p2e_msg_doorbell 5 p2e_msg_doorbell 4 p2e_msg_doorbell 3 p2e_msg_doorbell 2 p2e_msg_doorbell 1 p2e_msg_doorbell 0 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description p2e_msg_doorbell 31 31 0 powerpc subsystem to epc doorbell. indicates which of the 32 possible doorbells is activated. 0 not activated 1activated p2e_msg_doorbell 30:1 30:1 0 for all p2e_msg_doorbell 0 0 0
plb view

field name | plb bit(s) | reset | description
p2e_msg_doorbell 31 | 0 | 0 | powerpc subsystem to epc doorbell. indicates which of the 32 possible doorbells is activated. 0 not activated; 1 activated
p2e_msg_doorbell 30:1 | 1:30 | 0 for all | (as above)
p2e_msg_doorbell 0 | 31 | 0 | (as above)

10.8.17 mailbox communications between pci host and epc

communication between the pci host processor and the epc is accomplished by writing message data into buffers in the powerpc dram and signalling the destination processor with an interrupt. the pci host processor software manages message data buffers for pci host processor to epc (h2e) messages and epc to pci host processor (e2h) messages.

message data buffers are allocated to the epc by writing a buffer's starting address into the e2h_msg_resource register. writing this register sets the e2h_bufr_valid flag in the msg_status register. this flag indicates to the epc that the buffer is valid and can be used by the epc. the epc reads the e2h_bufr_valid value when a message buffer is required. the epc then reads the e2h_msg_resource register via the cab interface, and the e2h_bufr_valid indicator bit is reset. by polling this indicator bit, the pci host processor knows when to replenish the e2h_msg_resource register with a new buffer address value.

having acquired a message data buffer, the epc composes a message in the buffer and writes the buffer's starting address value into the e2h_msg_addr register. writing to this register generates an interrupt to the pci host processor. the pci host processor reads this register to find and process the message. reading this register clears the interrupt condition.

since the pci host processor software manages the message data buffers, there is no need for an h2e_msg_resource register. the pci host processor composes a message in one of the data buffers and writes the starting address of the buffer into the h2e_msg_addr register. writing to this register produces an interrupt input signal to the epc and sets the h2e_msg_busy bit of the msg_status register. as long as this flag is set, the epc is processing the message buffer. the epc reads the h2e_msg_addr register to locate the buffer in the powerpc dram and to reset the interrupt input signal to the epc. the epc resets the h2e_msg_busy bit by reading the h2e_msg_addr register from an alternate cab address when message processing is complete. the pci host processor polls the h2e_msg_busy bit to determine when the buffer can be reused.
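the host side of the h2e/e2h exchange mirrors the p2h code shown earlier, except that the buffers live in powerpc dram and the epc reaches the same registers over the cab. a hedged host-side sketch follows; the mmio accessor names and the assumption that the registers appear at their plb offsets in the pci-mapped window are illustrative.

```c
#include <stdint.h>

#define E2H_MSG_RESOURCE 0x80        /* plb x'7801 0080' */
#define H2E_MSG_ADDR     0xa8        /* plb x'7801 00a8' */
#define MSG_STATUS       0xa0        /* plb x'7801 00a0' */
#define E2H_BUFR_VALID   (1u << 1)   /* plb bit 30 of msg_status */
#define H2E_MSG_BUSY     (1u << 0)   /* plb bit 31 of msg_status */

extern uint32_t mbox_read32(uint32_t off);   /* hypothetical pci mmio accessors */
extern void     mbox_write32(uint32_t off, uint32_t val);

/* keep the epc supplied with an empty e2h buffer (address in powerpc dram) */
void e2h_replenish(uint32_t empty_buf_dram_addr)
{
    if ((mbox_read32(MSG_STATUS) & E2H_BUFR_VALID) == 0)
        mbox_write32(E2H_MSG_RESOURCE, empty_buf_dram_addr);
}

/* signal an h2e message already composed at buf_dram_addr in powerpc dram;
 * returns -1 if the epc is still busy with the previous message
 */
int h2e_send(uint32_t buf_dram_addr)
{
    if (mbox_read32(MSG_STATUS) & H2E_MSG_BUSY)
        return -1;
    mbox_write32(H2E_MSG_ADDR, buf_dram_addr);   /* interrupts the epc */
    return 0;
}
```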
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 402 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.18 epc to pci host resource (e2h_msg_resource) register the pci host processor accesses the epc to pci host message resource register from the plb while the epc accesses this register from its cab interface. this epc uses this register to obtain message buffers in the powerpc dram address space for messages it sends to the pci host processor. the pci host processor writes the starting dram address value of a message buffer into the e2p_msg_resource register. writing to this register sets the e2h_bufr_valid flag found in the message status register (see 10.8.23 message status (msg_status) register on page 409). reading the e2h_msg_resource register from the cab resets this flag. cab view plb view cab view plb view access type cab read (additional actions occur when reading this register using this address. please see above description for details.) plb read/write base address cab x ? 3801 0200 ? plb x ? 7801 0080 ? e2h_msg_resource 313029282726252423222120191817161514131211109876543210 e2h_msg_resource 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description e2h_msg_resource 31:0 x ? 0000 0000 ? epc to pci host message resource - written with the powerpc dram starting address of a message buffer. when reading this register via the cab, the e2h_bufr_valid flag found in the message status register (see 10.8.23 message status (msg_status) register on page 409) is set to 0. field name bit(s) reset description e2h_msg_resource 0:31 x ? 0000 0000 ? epc to pci host message resource - written with the powerpc dram starting address of a message buffer. when writing this register, the e2h_bufr_valid flag is set to 1.
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 403 of 554 10.8.19 epc to pci host message address (e2h_msg_addr) register the pci host processor accesses the epc to pci host message address register from the plb while the epc accesses this register from its cab interface. the epc uses this register to send messages to the pci host processor. the value written into this register is the powerpc dram address at which the message begins. writing to this register sets the e2h_msg bit of the pci_interr_status register. when the corre- sponding bit of the pci_interr_ena register is set to ? 1 ? , an interrupt signal to the pci host processor is acti- vated. reading the e2h_msg_addr register from the plb resets the e2h_msg bit of the pci_interr_status register. cab view plb view cab view plb view access type cab read/write plb read base address cab x ? 3801 0400 ? plb x ? 7801 0088 ? e2h_msg_addr 313029282726252423222120191817161514131211109876543210 e2h_msg_addr 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description e2h_msg_addr 31:0 x ? 0000 0000 ? epc to pci host message address - indicates the powerpc dram start- ing address of a message. field name bit(s) reset description e2h_msg_addr 0:31 x ? 0000 0000 ? epc to pci host message address - indicates the powerpc dram start- ing address of a message.
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 404 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.20 epc to pci host doorbell (e2h_doorbell) register the pci host processor accesses the epc to pci host doorbell register from the plb while the epc accesses this register from the cab interface. the epc uses this register to signal interrupts to the pci host processor. the epc has read and sum write access to this register using a cab address value. the mask used to access this register is contained in the data. when any of this register ? sbitsaresetto ? 1 ? and the corresponding bit of the pci_interr_ena register is set to ? 1 ? , an interrupt signal of the pci host processor is activated. this register is read by the pci host processor to determine which of the doorbells have been acti- vated. the pci host processor has read and rum write and read access to this register using a plb address value. cab view plb view cab view access type cab read/set_under_mask write plb read/reset_under_mask write base address cab x ? 3801 0800 ? plb x ? 7801 0098 ? e2h_doorbells e2h_msg_doorbell 31 e2h_msg_doorbell 30 e2h_msg_doorbell 29 e2h_msg_doorbell 28 e2h_msg_doorbell 27 e2h_msg_doorbell 26 e2h_msg_doorbell 25 e2h_msg_doorbell 24 e2h_msg_doorbell 23 e2h_msg_doorbell 22 e2h_msg_doorbell 21 e2h_msg_doorbell 20 e2h_msg_doorbell 19 e2h_msg_doorbell 18 e2h_msg_doorbell 17 e2h_msg_doorbell 16 e2h_msg_doorbell 15 e2h_msg_doorbell 14 e2h_msg_doorbell 13 e2h_msg_doorbell 12 e2h_msg_doorbell 11 e2h_msg_doorbell 10 e2h_msg_doorbell 9 e2h_msg_doorbell 8 e2h_msg_doorbell 7 e2h_msg_doorbell 6 e2h_msg_doorbell 5 e2h_msg_doorbell 4 e2h_msg_doorbell 3 e2h_msg_doorbell 2 e2h_msg_doorbell 1 e2h_msg_doorbell 0 313029282726252423222120191817161514131211109876543210 e2h_doorbells e2h_msg_doorbell 31 e2h_msg_doorbell 30 e2h_msg_doorbell 29 e2h_msg_doorbell 28 e2h_msg_doorbell 27 e2h_msg_doorbell 26 e2h_msg_doorbell 25 e2h_msg_doorbell 24 e2h_msg_doorbell 23 e2h_msg_doorbell 22 e2h_msg_doorbell 21 e2h_msg_doorbell 20 e2h_msg_doorbell 19 e2h_msg_doorbell 18 e2h_msg_doorbell 17 e2h_msg_doorbell 16 e2h_msg_doorbell 15 e2h_msg_doorbell 14 e2h_msg_doorbell 13 e2h_msg_doorbell 12 e2h_msg_doorbell 11 e2h_msg_doorbell 10 e2h_msg_doorbell 9 e2h_msg_doorbell 8 e2h_msg_doorbell 7 e2h_msg_doorbell 6 e2h_msg_doorbell 5 e2h_msg_doorbell 4 e2h_msg_doorbell 3 e2h_msg_doorbell 2 e2h_msg_doorbell 1 e2h_msg_doorbell 0 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description e2h_msg_doorbell 31 31 0 epc to pci host doorbell - indicates which of the 32 possible doorbells is activated. 0 not activated 1activated e2h_msg_doorbell 30:1 30:1 0 for all e2h_msg_doorbell 0 0 0
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 405 of 554 plb view field name bit(s) reset description e2h_msg_doorbell 31 0 0 epc to pci host doorbell - indicates which of the 32 possible doorbells is activated. 0 not activated 1 activated e2h_msg_doorbell 30:1 1:30 0 for all e2h_msg_doorbell 0 31 0
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 406 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.21 pci host to epc message address (h2e_msg_addr) register the pci host processor accesses the pci host to epc message address register from the plb while the epc accesses this register from the cab interface. the pci host uses this register to send messages to the epc. the value written into this register is the powerpc dram address at which the message begins. writing to this register sets the h2e_msg_interr signal to the epc and the h2e_msg_busy bit of the msg_status register. reading the h2e_msg_addr register from the primary cab address will reset the h2e_msg_interr signal to the epc. reading the h2e_msg_addr from the alternate cab address will reset the h2e_msg_busy bit of the msg_status register (see 10.8.23 message status (msg_status) register on page 409). cab view plb view cab view plb view access type primary read alternate read (additional actions occur when reading this register using this address. please see above description for details.) plb read/write base address primary x ? 3801 1000 ? alternate x ? 3802 0020 ? plb x ? 7801 00a8 ? h2e_msg_addr 313029282726252423222120191817161514131211109876543210 h2e_msg_addr 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description h2e_msg_addr 31:0 x ? 0000 0000 ? pci host to epc message address - indicates a message ? s powerpc dram starting address. field name bit(s) reset description h2e_msg_addr 0:31 x ? 0000 0000 ? pci host to epc message address - indicates a message ? s powerpc dram starting address.
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 407 of 554 10.8.22 pci host to epc doorbell (h2e_doorbell) register the pci host processor accesses the pci host to epc doorbell register from the plb while the epc accesses this register from the cab interface. the pci host processor uses this register to signal interrupts to the epc. the pci host processor has read and sum write access to this register. the data contains the mask used to access this register. when any of this register ? sbitsaresetto ? 1 ? , an interrupt signal of the epc is activated. this register is read by the epc to determine which of the doorbells have been activated.the epc has read and rum write access to this register using a cab address value. cab view plb view cab view access type cab read/reset_under_mask write plb read/set_under_mask write base address cab x ? 3801 2000 ? plb x ? 7801 00b0 ? h2e_doorbells h2e_msg_doorbell 31 h2e_msg_doorbell 30 h2e_msg_doorbell 29 h2e_msg_doorbell 28 h2e_msg_doorbell 27 h2e_msg_doorbell 26 h2e_msg_doorbell 25 h2e_msg_doorbell 24 h2e_msg_doorbell 23 h2e_msg_doorbell 22 h2e_msg_doorbell 21 h2e_msg_doorbell 20 h2e_msg_doorbell 19 h2e_msg_doorbell 18 h2e_msg_doorbell 17 h2e_msg_doorbell 16 h2e_msg_doorbell 15 h2e_msg_doorbell 14 h2e_msg_doorbell 13 h2e_msg_doorbell 12 h2e_msg_doorbell 11 h2e_msg_doorbell 10 h2e_msg_doorbell 9 h2e_msg_doorbell 8 h2e_msg_doorbell 7 h2e_msg_doorbell 6 h2e_msg_doorbell 5 h2e_msg_doorbell 4 h2e_msg_doorbell 3 h2e_msg_doorbell 2 h2e_msg_doorbell 1 h2e_msg_doorbell 0 313029282726252423222120191817161514131211109876543210 h2e_doorbells h2e_msg_doorbell 31 h2e_msg_doorbell 30 h2e_msg_doorbell 29 h2e_msg_doorbell 28 h2e_msg_doorbell 27 h2e_msg_doorbell 26 h2e_msg_doorbell 25 h2e_msg_doorbell 24 h2e_msg_doorbell 23 h2e_msg_doorbell 22 h2e_msg_doorbell 21 h2e_msg_doorbell 20 h2e_msg_doorbell 19 h2e_msg_doorbell 18 h2e_msg_doorbell 17 h2e_msg_doorbell 16 h2e_msg_doorbell 15 h2e_msg_doorbell 14 h2e_msg_doorbell 13 h2e_msg_doorbell 12 h2e_msg_doorbell 11 h2e_msg_doorbell 10 h2e_msg_doorbell 9 h2e_msg_doorbell 8 h2e_msg_doorbell 7 h2e_msg_doorbell 6 h2e_msg_doorbell 5 h2e_msg_doorbell 4 h2e_msg_doorbell 3 h2e_msg_doorbell 2 h2e_msg_doorbell 1 h2e_msg_doorbell 0 012345678910111213141516171819202122232425262728293031 field name bit reset description h2e_msg_doorbell 31 31 0 pci host to epc doorbell - indicates which of the 32 possible doorbells have been activated. 0 not activated 1 activated h2e_msg_doorbell 30:1 30:1 0 for all h2e_msg_doorbell 0 0 0
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 408 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 plb view field name plb bit reset description h2e_msg_doorbell 31 0 0 pci host to epc doorbell - indicates which of the 32 possible doorbells is activated. 0 not activated 1activated h2e_msg_doorbell 30:1 1:30 0 for all h2e_msg_doorbell 0 31 0
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 409 of 554 10.8.23 message status (msg_status) register the message status register provides status information associated with inter-processor messaging. this read only register is accessible from either the plb or the cab for the purpose of checking status associated with messaging. cab view plb view access type cab read plb read base address cab x ? 3801 4000 ? plb x ? 7801 00a0 ? reserved p2h_bufr_valid h2p_msg_busy e2p_bufr_valid p2e_msg_busy e2h_bufr_valid h2e_msg_busy 313029282726252423222120191817161514131211109876543210 reserved p2h_bufr_valid h2p_msg_busy e2p_bufr_valid p2e_msg_busy e2h_bufr_valid h2e_msg_busy 012345678910111213141516171819202122232425262728293031
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 410 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 cab view plb view field name cab bit(s) reset description reserved 31:6 reserved p2h_bufr_valid 5 0 powerpc subsystem to pci host message buffer valid indicator. 0 buffer not valid 1 buffer valid h2p_msg_busy 4 0 pci host to powerpc subsystem message busy indicator. 0 not busy processing h2p message 1 busy processing h2p message e2p_bufr_valid 3 0 epc to powerpc subsystem message buffer valid indicator. 0 buffer not valid 1 buffer valid p2e_msg_busy 2 0 powerpc to epc subsystem message busy indicator. 0 not busy processing p2e message 1 busy processing p2e message e2h_bufr_valid 1 0 epc to pci host message buffer valid indicator. 0 buffer not valid 1 buffer valid h2e_msg_busy 0 0 pci host to epc message busy indicator. 0 not busy processing h2e message 1 busy processing h2e message field name bit(s) reset description reserved 0:25 reserved p2h_bufr_valid 26 0 powerpc subsystem to pci host message buffer valid indicator. 0 buffer not valid 1 buffer valid h2p_msg_busy 27 0 pci host to powerpc subsystem message busy indicator. 0 not busy processing h2p message 1 busy processing h2p message e2p_bufr_valid 28 0 epc to powerpc subsystem message buffer valid indicator. 0 buffer not valid 1 buffer valid p2e_msg_busy 29 0 powerpc subsystem to epc message busy indicator. 0 not busy processing p2e message 1 busy processing p2e message e2h_bufr_valid 30 0 epc to pci host message buffer valid indicator. 0 buffer not valid 1 buffer valid h2e_msg_busy 31 0 pci host to epc message busy indicator. 0 not busy processing h2e message 1 busy processing h2e message
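note that each flag appears at a different numeric position in the cab and plb views (cab bit numbers count from the least-significant end, plb bit numbers from the most-significant end), but both views resolve to the same physical bit. expressed as conventional lsb-0 masks, the two tables above collapse to a single set of definitions; the macro names are illustrative.

```c
/* msg_status flags as lsb-0 masks: cab bit n maps to 1 << n,
 * plb bit n maps to 1 << (31 - n), and both give the same value here.
 */
#define MSG_H2E_BUSY        (1u << 0)   /* cab bit 0 / plb bit 31 */
#define MSG_E2H_BUFR_VALID  (1u << 1)   /* cab bit 1 / plb bit 30 */
#define MSG_P2E_BUSY        (1u << 2)   /* cab bit 2 / plb bit 29 */
#define MSG_E2P_BUFR_VALID  (1u << 3)   /* cab bit 3 / plb bit 28 */
#define MSG_H2P_BUSY        (1u << 4)   /* cab bit 4 / plb bit 27 */
#define MSG_P2H_BUFR_VALID  (1u << 5)   /* cab bit 5 / plb bit 26 */
```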
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 411 of 554 10.8.24 powerpc boot redirection instruction registers (boot_redir_inst) in system implementations in which the embedded powerpc subsystem boots from the d6 dram, the mail- box and dram interface macro performs powerpc boot address redirection. under these conditions, the hardware provides instructions that redirect the boot sequence to a location in the powerpc dram. storage for eight instructions is provided by the boot_redir_inst registers. the powerpc boot redirection instruction (boot_redir_inst) registers are accessed from the cab interface. these registers provide capacity for eight instructions for plb addresses x ? ffff ffe0 ? -x ? ffff fffc ? these instructions redirect the powerpc subsystem to boot from a location in the powerpc dram and are configured before the ppc_reset is cleared. access type read/write base address x ? 3800 0110 ? -x ? 3800 0117 ? boot_redir_inst 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description boot_redir_inst 31:0 x ? 0000 0000 ? powerpc boot redirection instruction address values contains instructions used by the powerpc subsystem to redirect the boot sequence to a loca- tion in d6 dram. offset corresponding powerpc instruction address 0x ? ffff ffe0 ? 1x ? ffff ffe4 ? 2x ? ffff ffe8 ? 3x ? ffff ffec ? 4x ? ffff fff0 ? 5x ? ffff fff4 ? 6x ? ffff fff8 ? 7x ? ffff fffc ?
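a hedged sketch of how system software might fill these registers before releasing the powerpc core: the 405 fetches its first instruction at plb x'ffff fffc', so the last slot typically holds an unconditional branch to the boot code in d6 dram. the cab accessor and the choice of branch target are assumptions; the 'ba' (branch absolute) and 'nop' encodings used here are standard powerpc, but verify them against your toolchain.

```c
#include <stdint.h>

#define BOOT_REDIR_INST_CAB_BASE 0x38000110u   /* eight cab registers, offsets 0..7 */

extern void cab_write32(uint32_t cab_addr, uint32_t val);   /* hypothetical cab accessor */

/* encode a powerpc unconditional absolute branch: ba target
 * (opcode 18, AA = 1, LK = 0); the target must be reachable as a 26-bit
 * sign-extended absolute address, which low d6 dram addresses are.
 */
static uint32_t ppc_ba(uint32_t target)
{
    return 0x48000000u | (target & 0x03fffffcu) | 0x2u;
}

/* load the eight redirection slots prior to clearing pwrpc_reset.
 * slots 0..6 (plb x'ffff ffe0'..x'ffff fff8') are filled with no-ops here;
 * slot 7 corresponds to the reset-vector fetch at plb x'ffff fffc'.
 */
void load_boot_redirect(uint32_t dram_entry_point)
{
    const uint32_t nop = 0x60000000u;   /* ori r0,r0,0 */

    for (int i = 0; i < 7; i++)
        cab_write32(BOOT_REDIR_INST_CAB_BASE + i, nop);

    cab_write32(BOOT_REDIR_INST_CAB_BASE + 7, ppc_ba(dram_entry_point));
}
```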
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 412 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.25 powerpc machine check (pwrpc_mach_chk) register this register is accessible on the cab and indicates the status of the machine check output of the powerpc processor core. access type read base address (cab) x ? 3800 0210 ? reserved mach_chk 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:1 reserved mach_chk 0 0 powerpc machine check indication. 0 no machine check has occurred 1 a machine check has occurred
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 413 of 554 10.8.26 parity error status and reporting when reading the powerpc subsystem memory, any parity errors encountered are recorded and actions are taken. powerpc subsystem memory parity errors are indicated for each double word returned on the plb regardless of the object size read. for this reason, parity must be initialized for every double word in each memory region used. furthermore, for powerpc processor code regions, correct parity must be initialized for each cache line in the region. parity will be set correctly when the memory is written. therefore, for code regions, this parity initialization is accomplished automatically when the code is loaded. likewise, this initial- ization is performed if a diagnostic is performed on the memory that writes to every location. in all other cases, this parity initialization must be performed for each region of memory used prior to use. since even bytes and odd bytes of the powerpc subsystem memory are stored in physically distinct mod- ules, error statistics are collected on an odd and even byte basis. this information may prove useful in isola- tion of a failing part. the word address value of the most recent parity error is stored in the slave error address register (sear). in addition, bits of the slave error status register (sesr) indicate which plb masters have experienced parity errors in reading the powerpc subsystem memory. this status information is indicated by odd and even byte for each of the three plb masters (instruction cache unit, data cache unit, and pci/plb macro). bits of the sesr are used to generate an interrupt input to the uic. if any of these bits are set to b ? 1 ? then the parity error interrupt input to the uic is set to b ? 1 ? . when the uic is configured to enable this interrupt input, an interrupt signal is generated to the ppc405 core for processing. as a part of the interrupt service routine, the sesr must be read to clear the source of the interrupt. 10.8.27 slave error address register (sear) the slave error address register records the word address value of the last occurrence of a parity error encountered by the dram interface slave unit. this address value will isolate the location of the parity error to within a word. access type read base address (plb) x ? 7801 00b8 ? error address 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description error address 0:31 x ? 0000 0000 ? last plb word address value that resulted in a parity error when using the dram interface slave unit.
ibm powernp np4gs3 network processor preliminary embedded powerpc ? subsystem page 414 of 554 np3_dl_sec10_eppc.fm.08 may 18, 2001 10.8.28 slave error status register (sesr) the slave error status register contains status information that indicates which masters have encountered parity errors in reading from the dram and whether these errors occurred in a byte with an even (perr_byte0) or an odd (perr_byte1) address value. the powerpc subsystem has three plb masters: the instruction cache unit (icu), the data cache unit (dcu), and the pci macro. the contents of this register are reset to x ? 0000 0000 ? when read. access type read and reset base address (plb) x ? 7801 00c0 ? icu_error_status dcu_error_status pci_error_status reserved perr_byte0 perr_byte1 reserved perr_byte0 perr_byte1 reserved perr_byte0 perr_byte1 reserved 012345678910111213141516171819202122232425262728293031 field name bit(s) reset description reserved 0:5 reserved perr_byte0 6 0 byte 0 parity error indication - whether or not the instruction cache unit plb master encountered a parity error since the register was last read. 0 no parity error encountered 1 parity error encountered perr_byte1 7 0 byte 1 parity error indication - whether or not the instruction cache unit plb master encountered a parity error since the register was last read. 0 no parity error encountered 1 parity error encountered reserved 8:13 reserved perr_byte0 14 0 byte 0 parity error indication - whether or not the data cache unit plb master encountered a parity error since the register was last read. 0 no parity error encountered 1 parity error encountered perr_byte1 15 0 byte 1 parity error indication - whether or not the data cache unit plb master encountered a parity error since the register was last read. 0 no parity error encountered 1 parity error encountered reserved 16:21 reserved perr_byte0 22 0 byte 0 parity error indication - whether or not the pci macro plb master encountered a parity error since the register was last read. 0 no parity error encountered 1 parity error encountered perr_byte1 23 0 byte 1 parity error indication - whether or not the pci macro plb master encountered a parity error since the register was last read. 0 no parity error encountered 1 parity error encountered reserved 24:31 reserved
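as noted in 10.8.26, the interrupt service routine must read the sesr to clear the parity-error interrupt; reading the sear first preserves the failing word address before status is cleared. a hedged isr sketch, with hypothetical accessor and logging helpers:

```c
#include <stdint.h>

#define SEAR_ADDR  0x780100b8u   /* slave error address register */
#define SESR_ADDR  0x780100c0u   /* slave error status register (read clears it) */

extern uint32_t plb_read32(uint32_t addr);                      /* hypothetical */
extern void     log_parity_error(uint32_t addr, uint32_t sesr); /* hypothetical */

/* uic handler for the powerpc subsystem memory parity-error interrupt */
void parity_error_isr(void)
{
    /* capture the failing word address before clearing status */
    uint32_t fail_addr = plb_read32(SEAR_ADDR);

    /* reading sesr returns which plb master (icu, dcu, or pci macro) saw the
     * error and whether it was an even or odd byte; the read also resets the
     * register, which removes the interrupt source
     */
    uint32_t sesr = plb_read32(SESR_ADDR);

    log_parity_error(fail_addr, sesr);
}
```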
10.8.29 parity error counter (perr_count) register

the parity error counter register contains two 16-bit counter values. these counters accumulate the total number of parity errors for even and odd byte addresses since the last time the chip was reset. the counters roll over to zero when their maximum value (65535) is reached.

access type: read
base address (plb): x'7801 00c8'

field name | bit(s) | reset | description
byte0_perr_count | 0:15 | x'0000' | count of byte 0 parity errors encountered when reading dram through the dram interface slave unit.
byte1_perr_count | 16:31 | x'0000' | count of byte 1 parity errors encountered when reading dram through the dram interface slave unit.
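because the counters are free-running 16-bit values that wrap at 65535, monitoring software normally tracks the delta between successive samples modulo 2^16 rather than the raw readings. a short sketch, with a hypothetical plb accessor:

```c
#include <stdint.h>

#define PERR_COUNT_ADDR 0x780100c8u

extern uint32_t plb_read32(uint32_t addr);   /* hypothetical */

/* return the number of new byte0/byte1 parity errors since the previous
 * sample, handling 16-bit wraparound; prev[] holds the last raw readings.
 */
void perr_delta(uint16_t prev[2], uint16_t delta[2])
{
    uint32_t raw = plb_read32(PERR_COUNT_ADDR);
    /* byte0_perr_count occupies plb bits 0:15 (upper half of the word) */
    uint16_t now[2] = { (uint16_t)(raw >> 16), (uint16_t)(raw & 0xffff) };

    for (int i = 0; i < 2; i++) {
        delta[i] = (uint16_t)(now[i] - prev[i]);   /* modulo-65536 arithmetic */
        prev[i]  = now[i];
    }
}
```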
10.9 system start-up and initialization

10.9.1 np4gs3 resets

a general reset is performed whenever the np4gs3's reset input pin is activated. a general reset includes both the powerpc core and the np4gs3 network function reset domains (see table 10-1). the powerpc core and the np4gs3 network function reset domains are also controlled by separate configuration registers, powerpc_reset and soft_reset, respectively. each of these registers is accessible via the np4gs3's cab interface. a general reset sets powerpc_reset to '1' (activated), but soft_reset is set to '0' (deactivated). when the general reset is deactivated, the powerpc core remains in reset, and the powerpc macros and the np4gs3 network function are in an idle condition with state machines active and control structures un-initialized. the np4gs3 network function becomes functional once epc picocode has been loaded and control structures are initialized. the epc is activated automatically when loading boot picocode from the spm interface or by setting the bd_override bit to '1' in the boot override register (see 13.13.4 boot override register (boot_override) on page 463). the powerpc core is activated by clearing the bit of the powerpc_reset register.

the powerpc core and the np4gs3 network function domains are separately reset and activated by setting and clearing their respective reset registers. setting the control bit in the soft_reset register causes the network function to be momentarily reset while the powerpc subsystem is held in reset.

the powerpc_reset is also activated whenever the watch dog timer of the powerpc core expires for a second time and watch dog reset enable (wd_reset_ena) is set to '1'. when enabled, the second expiration of the watch dog timer results in a pulse on the serr# signal of the pci bus. the tcr[wrc] field of the powerpc core is set to '10' to generate a chip reset request on the second expiration of the watch dog timer.

the powerpc core can be reset from the riscwatch debugger environment. a core reset performed from riscwatch resets the ppc405 core, but does not set the powerpc_reset control register. this allows a momentary reset of the powerpc core and does not require a release of the powerpc core by clearing the powerpc_reset control register. the powerpc core is also reset and the powerpc_reset control register is set when a 'chip reset' command is performed from riscwatch and the watch dog reset enable is set to '1'. this resets the powerpc core and holds it in reset until the powerpc_reset register is cleared.

table 10-1. reset domains
reset domain | applies
powerpc core | applies only to the ppc405 processor core and is used to control its start-up during system initialization
network function | applies to all other functions in the np4gs3
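in host- or epc-initialized systems the sequence "hold in reset, load code, clear powerpc_reset" is driven over the cab. the sketch below is a heavily hedged illustration of that pattern only: the cab accessor, the symbolic register addresses, the bd_override bit position, and the helper routines are all assumptions (the actual cab addresses for powerpc_reset and boot_override are defined in the configuration chapter and are not restated here).

```c
#include <stdint.h>

/* hypothetical symbols; real cab addresses come from the configuration chapter */
extern void cab_write32(uint32_t cab_addr, uint32_t val);
extern const uint32_t CAB_POWERPC_RESET;
extern const uint32_t CAB_BOOT_OVERRIDE;
#define BOOT_OVERRIDE_BD_OVERRIDE  1u   /* bd_override bit, position assumed */

extern void load_powerpc_code_to_dram(void);                /* hypothetical earlier step   */
extern void load_boot_redirect(uint32_t dram_entry_point);  /* see the sketch in 10.8.24   */

/* release the embedded 405 after its code and redirection slots are in place */
void start_powerpc_core(uint32_t dram_entry_point)
{
    load_powerpc_code_to_dram();
    load_boot_redirect(dram_entry_point);

    /* clearing powerpc_reset releases the core; it then fetches from
     * plb x'ffff fffc', which the redirection registers map to dram code
     */
    cab_write32(CAB_POWERPC_RESET, 0);
}

/* start the epc picocode when it was not booted from the spm interface */
void start_epc_picocode(void)
{
    cab_write32(CAB_BOOT_OVERRIDE, BOOT_OVERRIDE_BD_OVERRIDE);  /* set bd_override = 1 */
}
```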
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 417 of 554 10.9.2 systems initialized by external pci host processors system implementations with control function centralized in a pci host processor are initialized primarily by an external pci host processor. these systems do not use the spm interface or its associated flash memory to boot the epc and dasl picocode (boot_picocode = '1'). the host processor (boot_ppc = '0') loads code for the powerpc processor and the epc and dasl picoprocessors. the general sequence of the start-up and initialization is as follows: 1. the host processor boots and performs blade level configuration and testing while holding blade sub- systems in reset. 2. the host processor releases each of the blade subsystems ? reset signals in sequence. 3. release of the general reset input pin of the np4gs3 activates its state machines. the np4gs3 is in an idle state with un-initialized structures and the ppc405 core remains in reset and host pci configuration is enabled. 4. the host system uses the pci configuration protocol to configure internal registers of the pci interface macro. this configuration sets the base address values of the pci target maps. the pci master maps are disabled. 5. the host processor uses the appropriate pci target address values to configure and initialize the np4gs3 ? s dram controllers via the cab interface. 6. the host processor uses the appropriate pci target address values to load the powerpc code into one of the np4gs3 ? s drams. this code includes the branch code loaded into the boot_redir_inst registers. 7. the host loads the picocode memories of the epc and dasl. these memories are accessible via the cab interface macro. the addresses of the registers controlling this interface were mapped to pci addresses in step 4. 8. the host processor starts the ppc405 core by clearing the powerpc_reset configuration register. this register is accessed via the cab interface. the powerpc subsystem starts by fetching the instruction at plb address x ? ffff fffc ? . this address is decoded by hardware to provide an unconditional branch to instruction space in the powerpc dram memory. 9. the host processor uses the cab interface to set the bd_override bit to ? 1 ? in the boot override register (see 13.13.4 boot override register (boot_override) on page 463) to start the epc code. this code controls the initialization of the np4gs3 structures in preparation for network traffic. alternatively, this ini- tialization is performed by the host processor via the cab interface. 10. communication ports are configured and enabled by either the host processor or the ppc405 core.
10.9.3 systems with pci host processors and initialized by powerpc subsystem

the powerpc subsystem primarily controls the start-up and initialization sequences of systems in this category. these systems do not use the spm interface or its associated flash memory for booting the epc boot picocode (boot_picocode = '1'). the powerpc subsystem boots from pci address space (boot_ppc = '1') and loads code for the powerpc processor and the epc and dasl picoprocessors. the general sequence of the start-up and initialization is as follows:

1. the host processor boots and performs blade-level configuration and testing while holding blade subsystems in reset.
2. the host processor releases each of the blade subsystems' reset signals in sequence.
3. release of the general reset input pin of the np4gs3 activates its state machines. the np4gs3 is in an idle state with un-initialized structures, the ppc405 core remains in reset, and host pci configuration is enabled.
4. the host system uses the pci configuration protocol to configure internal registers of the pci interface macro. this configuration sets the base address values of the pci target maps. the pci master maps are disabled except for pmm0, which maps plb addresses x'fffe 0000' through x'ffff ffff' to the same addresses in pci address space.
5. the host processor starts the ppc405 core by clearing the powerpc_reset configuration register. this register is accessed via the cab interface.
6. the ppc405 core boots from code residing in the pci address space starting at address x'ffff fffc'.
7. the ppc405 core configures and initializes the np4gs3's dram controllers via the cab interface.
8. the ppc405 core loads the powerpc code, via the dram interface macro, into one of the np4gs3's drams.
9. the ppc405 core loads the picocode for the epc and dasl via the cab interface.
10. using the cab interface, the host processor sets the bd_override bit to '1' in the boot override register (see 13.13.4 boot override register (boot_override) on page 463) to start the epc picocode. this picocode controls the initialization of the np4gs3 structures in preparation for network traffic. alternatively, this initialization is performed by the ppc405 core via the cab interface.
11. communication ports are configured and enabled by the ppc405 core.
ibm powernp np4gs3 preliminary network processor np3_dl_sec10_eppc.fm.08 may 18, 2001 embedded powerpc ? subsystem page 419 of 554 10.9.4 systems without pci host processors and initialized by powerpc subsystem the epc initially controls start-up and initialization sequences of systems of this category, but they are prima- rily controlled by the powerpc subsystem. these systems use the spm interface and its associated flash memory for booting the epc picocode (boot_picocode = '0'). the powerpc subsystem boots from pci address space ( boot_ppc = '1') and loads code for the powerpc processor and the epc and dasl picopro- cessors. the general sequence of the start-up and initialization is as follows: 1. release of the general reset input pin of the np4gs3 activates its state machines. the pci master maps are disabled except for the pmm0 which maps plb addresses x ? fffe 0000 ? through x ? ffff ffff ? to the same addresses in pci address space. epc boot picocode is loaded from the flash memory via the spm interface. the spm interface automatically starts the epc and the ppc405 core remains in reset. 2. epc code executes diagnostic and initialization code which includes the initialization of the np4gs3 ? s dram controllers. 3. the epc starts the ppc405 core by clearing the powerpc_reset configuration register. this register is accessed via the cab interface. 4. the ppc405 core will boot from code residing in the pci address space starting at address x ? ffff fffc ? . 5. the ppc405 core processor loads the powerpc code via the dram interface macro into one of the np4gs3 ? s drams. 6. the ppc405 core processor or the epc loads the picocode for the dasl via the cab interface. 7. communication ports are configured and enabled by the ppc405 core.
10.9.5 Systems without PCI Host or Delayed PCI Configuration and Initialized by EPC

The EPC controls the start-up and initialization sequence of systems in this category. These systems use the SPM interface and its associated flash memory for booting the EPC picocode (boot_picocode = '0'). Code for the PowerPC subsystem and the EPC is loaded by the SPM interface and the EPC (boot_ppc = '0'). Code for the PowerPC subsystem exists in the flash memory or is provided using guided traffic.

The general sequence of the start-up and initialization is as follows:

1. Release of the general reset input pin of the NP4GS3 activates its state machines. EPC picocode is loaded from the flash memory via the SPM interface. The SPM interface automatically starts the EPC. The PPC405 core remains in reset and PCI host configuration is disabled.
2. EPC code executes diagnostic and initialization code, which includes the initialization of the NP4GS3's DRAM controllers.
3. The EPC loads the code for the PowerPC processor from the flash memory into the DRAM. This code includes the branch code loaded into the Boot_Redir_Inst registers. Alternatively, this code is loaded using guided traffic. Guided traffic flows once the communications port connecting the source has been enabled (see step 7).
4. The EPC starts the PPC405 core by clearing the PowerPC_Reset configuration register. This register is accessed via the CAB interface.
5. The PPC405 core boots from code residing in the PLB address space starting at address X'FFFF FFFC'. This address is decoded by hardware to provide an unconditional branch to instruction space in the PowerPC DRAM memory.
6. If desired, the PPC405 core can change the contents of the PCI configuration header registers and enable PCI host configuration.
7. Communication ports are configured and enabled by the PPC405 core or, alternatively, by the EPC.
11. Reset and Initialization

This section provides, by example, a method for initializing the NP4GS3. It is not intended to be exhaustive in scope, since there are many configurations and environments where the NP4GS3 may be applied. The external serial parallel manager (SPM) field programmable gate array (FPGA) mentioned in this chapter is a component of the Network Processor evaluation kit and is not sold separately.

11.1 Overview

The NP4GS3 supports a variety of system and adapter configurations. In order to support a particular environment, the network processor must be initialized with parameters that match that environment's system or adapter requirements. The sequence of reset and initialization is shown in Table 11-1. Some system environments do not require all of the steps, and some require that certain steps be performed differently. The NP4GS3 supports the system environments shown in Figure 11-1 for reset and initialization.

Table 11-1. Reset and Initialization Sequence

Step  Action                     Notes
1     Set I/Os                   Device I/Os set to match adapter requirements
2     Reset                      Must be held in reset for a minimum of 1 µs
3     Boot                       Several boot options
4     Setup 1                    Low-level hardware setup
5     Diagnostics 1              Memory and register diagnostics
6     Setup 2                    Basic hardware setup
7     Hardware initialization    Hardware self-initialization of various data structures
8     Diagnostics 2              Data flow diagnostics
9     Operational                Everything ready for first guided frame
10    Configure                  Functional configuration
11    Initialization complete    Everything ready for first data frame
The following sections describe each step in the reset and initialization sequence and the various ways to accomplish each step.

Figure 11-1. System Environments
(Diagram not reproduced. It shows NP4GS3-based configurations with the following annotations: external SPM FPGA, external EEPROM, optional CABwatch; external SPM FPGA, external EEPROM, optional CABwatch, external PCI EEPROM for PowerPC, external PCI host; optional SPM FPGA, optional EEPROM, optional CABwatch, optional PCI EEPROM for PowerPC, external PCI host.)
11.2 Step 1: Set I/Os

Several NP4GS3 I/Os must be set to appropriate values to operate correctly. Most of these I/Os will be set to a fixed value at the card level, but some will be set to the appropriate value based on system parameters. The table below lists all configurable I/Os that must be set prior to initial reset.

Additional configuration bits could be utilized by defining additional I/O on an external SPM FPGA as configuration inputs. If the user defines registers in the SPM CAB address space and a corresponding external SPM FPGA design, the NP4GS3 can read these SPM I/Os and make configuration adjustments based on their values (for example, the type of CP interface might be 100 Mb or 1 Gb). Custom boot picocode is required to take advantage of these additional features.

All clocks must be operating prior to issuing a reset to the NP4GS3. The Clock_Core, Switch Clock A, and Switch Clock B inputs drive internal PLLs that must lock before the internal clock logic will release the operational signal. The Clock125 and DMU_*(3) inputs, if applicable, should also be operating in order to properly reset the DMU interfaces. The PCI clock (PCI_Clk) must be operating in order to properly reset the PCI interface.

Table 11-2. Set I/Os Checklist

I/O                   Values                                                           Notes
Testmode(1:0)         Table 2-30: Miscellaneous Pins, page 77 (encoding)               The adapter may require a mechanism to set the test mode I/Os in order to force various test scenarios.
Boot_Picocode         Table 2-30: Miscellaneous Pins, page 77 (encoding)               Controls which interface loads the internal EPC instruction memory and boots the guided frame handler thread.
Boot_PPC              Table 2-30: Miscellaneous Pins, page 77 (encoding)               Controls from where the internal PPC fetches its initial boot code.
PCI_Speed             Table 2-26: PCI Pins, page 72 (encoding)                         Controls the speed of the PCI bus.
Switch_BNA            Table 2-30: Miscellaneous Pins, page 77 (encoding)               Initial value for the primary switch interface.
CPDetect_A            Table 2-15: PMM Interface Pin Multiplexing, page 55 (encoding)   Used to find a locally attached control point function (CPF).
CPDetect_B            Table 2-15: PMM Interface Pin Multiplexing, page 55 (encoding)   Used to find a locally attached CPF.
CPDetect_C            Table 2-15: PMM Interface Pin Multiplexing, page 55 (encoding)   Used to find a locally attached CPF.
CPDetect_D            Table 2-15: PMM Interface Pin Multiplexing, page 55 (encoding)   Used to find a locally attached CPF.
Spare_Tst_Rcvr(9:0)   Table 2-30: Miscellaneous Pins, page 77 (correct tie values)     Used during manufacturing testing.
11.3 Step 2: Reset the NP4GS3

Once all configuration I/Os are set and the clocks are running, the NP4GS3 can be reset using the Blade_Reset signal. This signal must be held active for a minimum of 1 µs and then returned to its inactive state. Internal logic requires an additional 101 µs to allow the PLLs to lock and all internal logic to be reset. The PCI bus must be idle during this interval to ensure proper initialization of the PCI bus state machine. At this point, the NP4GS3 is ready to be booted.

In addition to the Blade_Reset signal, the NP4GS3 supports two other "soft" reset mechanisms: the Soft_Reset register allows the entire NP4GS3 to be reset (just like a Blade_Reset), and the PowerPC_Reset register allows the PowerPC core to be reset. A Blade_Reset or Soft_Reset causes the PowerPC_Reset register to activate and hold the PowerPC core in reset until this register is cleared. Therefore, a Blade_Reset resets the entire NP4GS3 and holds the PowerPC core in reset until it is released by the EPC or an external host using the PCI bus.
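As a minimal sketch of the timing just described, the reset pulse on a board-support package might look like the following. The assert_blade_reset()/deassert_blade_reset() and delay_us() helpers are hypothetical board-level functions, not part of the NP4GS3 documentation.

```c
#include <stdint.h>

/* Hypothetical board-support functions for the reset pin and delays. */
extern void assert_blade_reset(void);
extern void deassert_blade_reset(void);
extern void delay_us(uint32_t microseconds);

/* Step 2: pulse Blade_Reset, then wait for the PLLs and internal logic.
 * The PCI bus must be kept idle during the 101 us settling interval.   */
static void np4gs3_hard_reset(void)
{
    assert_blade_reset();
    delay_us(2);            /* >= 1 us active time (margin added)       */
    deassert_blade_reset();
    delay_us(101);          /* PLL lock / internal reset distribution   */
    /* The device is now ready to be booted (step 3).                   */
}
```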
11.4 Step 3: Boot

11.4.1 Boot the Embedded Processor Complex (EPC)

Booting the NP4GS3's EPC involves loading the internal picocode instruction space and turning over control of execution to the GFH thread. The GFH thread executes the loaded picocode and completes the appropriate steps in the bringup process. The NP4GS3 supports four ways to load the internal picocode instruction space: through the SPM interface logic, through the PCI bus from an external host, through the embedded PowerPC, or through CABwatch. When using the last three methods, once the picocode instruction space is loaded, the bd_override bit must be set to '1' in the Boot Override register (see 13.13.4 Boot Override Register (Boot_Override) on page 463), which causes the GFH thread to start the code stored at address '0'.

11.4.2 Boot the PowerPC

There are two steps in the process of booting the NP4GS3's embedded PowerPC. First, using the Boot_PPC I/O, the PowerPC support logic must be configured to boot the PowerPC from either the external D6 DRAM or the PCI bus. Second, the PowerPC must be released to execute the appropriate boot code.

The PowerPC boot code can be mapped either into PCI address space or into the NP4GS3's external D6 DRAM, depending on the setting of the Boot_PPC I/O. If an external host processor is used on the PCI bus, it should use the PCI configuration protocol to set the NP4GS3's PCI target maps for access into the network processor's internal address space. If the Boot_PPC I/O chooses the PCI bus, the internal PLB bus addresses X'FFFE 0000' through X'FFFF FFFF' are mapped to PCI address space. Once the PowerPC_Reset register is cleared (by either the EPC or an external host across the PCI bus), the PowerPC fetches and executes the boot code from across the PCI bus.

If the Boot_PPC I/O chooses the external D6 DRAM, the D6 DRAM must be written with the appropriate boot code and the Boot_Redir_Inst registers must be written to point to the code in the D6 DRAM before the PowerPC_Reset register is released. The internal logic maps the PowerPC's boot addresses of X'FFFF FFE0' through X'FFFF FFFC' to the Boot_Redir_Inst registers, and the remaining boot code is fetched from the external D6 DRAM. The D6 DRAM and the Boot_Redir_Inst registers can be written by either the EPC or an external host on the PCI bus. When everything is set up, use a CAB write to clear the PowerPC_Reset register to allow the PowerPC core to execute the boot code.

11.4.3 Boot Summary

The EPC must be booted by first loading its picocode instructions (by either the SPM, an external PCI host, the embedded PowerPC, or CABwatch) and then issuing the boot done signal (by the picocode loader). If the embedded PowerPC is to be used, its instruction space must be loaded (if D6 is used), the boot logic must be pointed to the appropriate boot location (PCI or D6), and finally the PowerPC_Reset register must be released (by either the EPC, an external PCI host, or CABwatch). Once one or both systems are booted, the following steps can be performed by one or both processing complexes. (Some accesses to external memories can only be performed by the EPC complex.)
11.5 Step 4: Setup 1

Setup 1 is needed to set some low-level hardware functions that enable the NP4GS3 to interface with its external DRAMs and to configure some internal registers that enable the execution of Step 5: Diagnostics 1. Setup 1 should configure or check the following registers according to the system setup and usage:

Table 11-3. Setup 1 Checklist

Register                         Fields                          Notes
Memory Configuration register    All                             This register is set to match the populated external memories.
DRAM Parameter register          All                             This register is set to match the external DRAMs' characteristics.
Thread Enable register           All                             This register enables the non-GFH threads. The GFH is always enabled.
Initialization register          DRAM Cntl Start                 This bit starts the DRAM interfaces.
Initialization Done register     CS Init Done, E_DS Init Done    These bits indicate that the DRAM initialization has completed.
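A hedged sketch of how setup 1 might be driven over the CAB is shown below. The cab_write()/cab_read() helpers, the delay routine, and the poll budget are illustrative assumptions; the Memory_Config, DRAM_Parm, Thread_Enable, Init, and Init_Done addresses and bit positions are taken from the register descriptions in Section 13.

```c
#include <stdint.h>
#include <stdbool.h>

extern void     cab_write(uint32_t cab_addr, uint32_t value);
extern uint32_t cab_read(uint32_t cab_addr);
extern void     delay_us(uint32_t microseconds);

#define MEMORY_CONFIG_REG   0xA0000120u
#define DRAM_PARM_REG       0xA0002400u    /* offset 0 shown only           */
#define THREAD_ENABLE_REG   0xA0008020u
#define INIT_REG            0xA0008100u    /* bit 26 = dram_cntl_start      */
#define INIT_DONE_REG       0xA0008200u    /* bit 21 = CS, bit 20 = E_DS    */

/* Setup 1: describe the populated memories, enable threads, then start
 * the DRAM controllers and wait for them to report initialization done. */
static bool np4gs3_setup1(uint32_t mem_cfg, uint32_t dram_parm,
                          uint32_t thread_mask)
{
    cab_write(MEMORY_CONFIG_REG, mem_cfg);     /* match populated memories */
    cab_write(DRAM_PARM_REG,     dram_parm);   /* match DRAM timing/sizes  */
    cab_write(THREAD_ENABLE_REG, thread_mask); /* GFH (bit 0) always on    */

    cab_write(INIT_REG, 1u << 26);             /* dram_cntl_start          */

    for (int i = 0; i < 1000; i++) {           /* arbitrary poll budget    */
        uint32_t done = cab_read(INIT_DONE_REG);
        if ((done & (1u << 21)) && (done & (1u << 20)))
            return true;                       /* CS and E_DS init done    */
        delay_us(10);
    }
    return false;
}
```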
11.6 Step 5: Diagnostics 1

Diagnostics 1 tests internal registers, internal memories, and external memories as required by the diagnostics program (read and write tests). This step comes before the hardware initialization step because several of these structures will be initialized by the hardware to contain functional data structures. By testing these structures first, the diagnostics program does not need to be concerned with corrupting the contents of these locations during hardware initialization. Care must be taken that the values written to the structures do not force an undesirable situation (such as soft resetting the device). However, most of these structures can be tested by the diagnostics program to ensure proper operation. Table 11-4 lists some of the structures that could be tested by this step.

Table 11-4. Diagnostics 1 Checklist

Structure                                  Test                                                Notes
Phase locked loop fail                     Verify all PLLs locked                              If any PLL fails, any further operation is questionable.
DPPU processors                            ALU, scratch memory, internal processor registers   Test each thread.
Ingress data store                         Read/write
Egress data store                          Read/write
Control store D0-D4                        Read/write
Control store D6                           Read/write                                          Coordinated with PowerPC code loading.
Control store H0-H1                        Read/write
Counter definition table                   Configure                                           Set up to test counters.
Counter memory                             Read/write/add
Policy manager memory                      Read/write
Egress QCBs                                Read/write
Egress RCB                                 Read/write
Egress target port queues                  Read/write
MCCA                                       Read/write
PMM Rx/Tx counters                         Read/write
PMM SA tables                              Read/write
Ingress BCB                                Read/write
Ingress FCB                                Read/write                                          Not all fields are testable.
CIA memory                                 Read/write
Ingress flow control probability memory    Read/write
Egress flow control probability memory     Read/write
DASL-A picocode memory                     Read/write
DASL-B picocode memory                     Read/write
Various internal configuration registers   Read/write                                          Not all fields will be testable. Care must be taken when changing certain control bits.
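The read/write tests in Table 11-4 are conventional pattern tests applied through the CAB. The routine below is only a sketch of that idea: the cab_write()/cab_read() helpers are hypothetical, the address range is supplied by the caller, and it must be applied only to side-effect-free locations, per the cautions above.

```c
#include <stdint.h>
#include <stdbool.h>

extern void     cab_write(uint32_t cab_addr, uint32_t value);
extern uint32_t cab_read(uint32_t cab_addr);

/* Walk a CAB-addressable structure with alternating data patterns.
 * 'start', 'count', and 'stride' describe the structure's address layout
 * and must be chosen from locations that tolerate arbitrary values.      */
static bool diag1_pattern_test(uint32_t start, uint32_t count, uint32_t stride)
{
    static const uint32_t patterns[] = { 0x00000000u, 0xFFFFFFFFu,
                                         0xAAAAAAAAu, 0x55555555u };

    for (unsigned p = 0; p < sizeof(patterns) / sizeof(patterns[0]); p++) {
        for (uint32_t i = 0; i < count; i++)
            cab_write(start + i * stride, patterns[p]);
        for (uint32_t i = 0; i < count; i++)
            if (cab_read(start + i * stride) != patterns[p])
                return false;   /* record the failing address for the CPF */
    }
    return true;
}
```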
11.7 Step 6: Setup 2

Setup 2 is needed to set up the hardware for self-initialization and to configure the hardware structures for the operational state. These configuration registers must be set to the desired values based on the system design and usage:

Table 11-5. Setup 2 Checklist

Register                                   Fields            Notes
Master Grant Mode                          All
TB Mode                                    All
Egress Reassembly Sequence Check           All
Aborted Frame Reassembly Action Control    All
Packing Control                            All
Ingress BCB_FQ Thresholds                  All
Egress SDM Stack Threshold                 All
Free Queue Extended Stack Maximum Size     All
Egress FQ Thresholds                       All
FCB Free Queue Size register (FCB_FQ_Max)  All
DMU Configuration                          All
    DMU configuration can be postponed until Step 10: Configure if DMU Init Start is also postponed. The DMU for the CPF must be configured during Setup 2. If the DMU is configured, the appropriate external physical devices must also be configured. Note that external physical devices should be held in reset until the DMU configuration is completed.
Packet over SONET Control                  All
    POS configuration can be postponed until Step 10: Configure if DMU Init Start is also postponed.
Ethernet Encapsulation Type for Control    All
Ethernet Encapsulation Type for Data       All
DASL Picocode Memory                       Both A and B
    Written with the appropriate DASL picocode.
DASL Initialization and Configuration      Primary set
    DASL init can be postponed until the Configure step if DASL Start is also postponed and the CPF is locally attached.
DASL Initialization and Configuration      DASL Wrap Mode
    DASL wrap mode can be postponed until the Configure step.
DASL Wrap                                  All
11.8 Step 7: Hardware Initialization

Hardware initialization allows the NP4GS3 to self-initialize several internal structures, thereby decreasing the overall time required to prepare the processor for operation. Several internal structures will be initialized with free lists, default values, or initial states in order to accept the first guided frames from the CPF. Once these data structures are initialized, the picocode should not modify them with further read/write diagnostics. To initiate the hardware self-initialization, the registers shown in Table 11-6 need to be written. A hedged code sketch of these writes follows the table.

Table 11-6. Hardware Initialization Checklist

Register                                 Fields              Notes
DASL Start                               All                 Only start the DASL interface after the primary set of DASL configuration bits has been configured.
DASL Initialization and Configuration    Alt_Ena             Only start the alternate DASL after the primary DASL has been started.
Initialization                           DMU Set             Only start each DMU after its associated DMU configuration has been set.
Initialization                           Functional Island   Starts all other islands' self-initialization.
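The last two rows of Table 11-6 map onto the Initialization register described in 13.7.1. The sketch below is an assumption-laden illustration (hypothetical cab_write() helper, all four DMUs started at once) rather than production code; real bring-up code starts only the DMUs whose DMU Configuration registers have been programmed, and the DASL Start register (not shown in this excerpt) is written beforehand.

```c
#include <stdint.h>

extern void cab_write(uint32_t cab_addr, uint32_t value);

#define INIT_REG                 0xA0008100u
#define INIT_FUNCTIONAL_ISLANDS  (1u << 27)   /* functional_island_init      */
#define INIT_DMU_BITS            (0xFu << 28) /* dmu_init, one bit per DMU   */

/* Step 7: kick off hardware self-initialization. Completion is reported
 * in the Initialization Done register (polled in step 8).                 */
static void np4gs3_hw_init(void)
{
    uint32_t v = 0;

    /* Start the functional islands' self-initialization. */
    v |= INIT_FUNCTIONAL_ISLANDS;
    cab_write(INIT_REG, v);

    /* Start the DMUs (here all four) once their DMU Configuration
     * registers have been set up; previously set bits are kept set.       */
    v |= INIT_DMU_BITS;
    cab_write(INIT_REG, v);
}
```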
11.9 Step 8: Diagnostics 2

Diagnostics 2 determines if the NP4GS3 is ready for operation and allows testing of data flow paths. The items listed in Table 11-7 should be set up, checked, and/or tested.

Table 11-7. Diagnostic 2 Checklist

Register                                 Fields            Notes
Initialization Done                      All               Started in the hardware initialization step. The code polls this register until a timeout occurs or all expected bits are set. DASL timeout = 20 ms; E_EDS timeout = 15 ms.
DASL Initialization and Configuration    Pri_Sync_Term     If the primary DASL was initialized and sync termination should occur from the network processor, this register should be set to cause idle cells to be sent.
DASL Initialization and Configuration    Alt_Sync_Term     If the alternate DASL was initialized and sync termination should occur from the network processor, this register should be set to cause idle cells to be sent.
LU Def Table                             Read/write test   These structures can only be tested after hardware initialization.
SMT Compare Table                        Read/write test   These structures can only be tested after hardware initialization.
Tree Search Free Queues                  Read/write test   These structures can only be tested after hardware initialization. Not all fields are testable.
Port Configuration Table                 All               Set to default values for Diagnostics 2.
LU Def Table                             All               Set to default values for Diagnostics 2.
Ingress Target Port Data Storage Map     All               Set to default values for Diagnostics 2.
Target Port Data Storage Map             All               Set to default values for Diagnostics 2.
Build frame on egress side                                 Lease twins and store a test frame in the egress data store.
Half wrap                                                  Wrap a test frame from egress to ingress using the wrap DMU.
Full wrap                                                  Wrap an ingress frame back to the egress side if the DASL is in wrap mode or the DASL has been completely configured, including target blade information.
External wrap                                              If external physical devices are configured, full external wraps can be performed.
Tree searches                                              To test the tree search logic, tree searches can be performed on a pre-built sample tree written to memory.
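A sketch of the Init_Done polling described in the first row of Table 11-7 appears below, using the bit positions from 13.7.2. The cab_read()/delay_us() helpers are hypothetical; the 20 ms and 15 ms budgets come from the table, while the polling interval is an arbitrary choice.

```c
#include <stdint.h>
#include <stdbool.h>

extern uint32_t cab_read(uint32_t cab_addr);
extern void     delay_us(uint32_t microseconds);

#define INIT_DONE_REG      0xA0008200u
#define P_DASL_INIT_DONE   (1u << 16)
#define E_EDS_INIT_DONE    (1u << 19)

/* Poll one Init_Done bit until it is set or the timeout budget expires. */
static bool poll_init_done(uint32_t mask, uint32_t timeout_us)
{
    const uint32_t step_us = 100;                 /* arbitrary poll period */
    for (uint32_t t = 0; t < timeout_us; t += step_us) {
        if (cab_read(INIT_DONE_REG) & mask)
            return true;
        delay_us(step_us);
    }
    return false;
}

/* Diagnostics 2: DASL gets a 20 ms budget, the egress EDS gets 15 ms.   */
static bool diag2_check_init_done(void)
{
    return poll_init_done(P_DASL_INIT_DONE, 20000) &&
           poll_init_done(E_EDS_INIT_DONE, 15000);
}
```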
11.10 Step 9: Operational

After the Diagnostics 2 tests have finished, any previously written default values may need to be updated to allow this step to proceed. If all diagnostics have passed, the operational signal can be activated to indicate to an external CPF that the NP4GS3 is ready to receive guided frames. This signal is activated by writing the NP4GS3 Ready register, which then activates the operational I/O.

If some portion of the diagnostics has not passed, the Ready register should not be written. This causes the CPF to time out and recognize the diagnostic failure. To determine which portion of the diagnostics has failed, the system designer must make provisions at the board or system level to record the status in a location that is accessible by the CPF. One method is to provide an I²C interface to an external SPM FPGA which the CPF could access.
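Signalling readiness reduces to a single CAB write of the NP4GS3 Ready register (13.8). The fragment below assumes the same hypothetical cab_write() helper used in the earlier sketches.

```c
#include <stdint.h>

extern void cab_write(uint32_t cab_addr, uint32_t value);

#define NPR_READY_REG  0xA0040020u
#define NPR_READY_BIT  (1u << 31)      /* drives the operational pin */

/* Step 9: only call this after all diagnostics have passed; otherwise
 * leave the register alone so the CPF times out and flags the failure. */
static void np4gs3_signal_ready(void)
{
    cab_write(NPR_READY_REG, NPR_READY_BIT);
}
```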
11.11 Step 10: Configure

After the operational signal has been activated, the CPF can send guided frames to the NP4GS3 for functional configuration. Items that can be configured include:

Table 11-8. Configure Checklist

Register                                 Fields             Notes
DASL Initialization and Configuration    Primary set        The primary set can be configured if postponed during the Setup 2 step.
DASL Initialization and Configuration    DASL Wrap Mode     DASL wrap mode can be set if postponed during the Setup 2 step.
DASL Start                               All                The primary DASL can be started if postponed during the Setup 2 step.
DASL Initialization and Configuration    Alt_Ena            The alternate set can be configured if postponed during the Setup 2 step.
Initialization Done                      P_DASL Init Done   This bit should be polled if started during the Configure step.
Initialization Done                      A_DASL Init Done   This bit should be polled if started during the Configure step.
DASL Initialization and Configuration    Pri_Sync_Term      If the primary DASL was initialized and sync termination should occur from the network processor, this register should be set to cause idle cells to be sent.
DASL Initialization and Configuration    Alt_Sync_Term      If the alternate DASL was initialized and sync termination should occur from the network processor, this register should be set to cause idle cells to be sent.
DMU Configuration                        All                Configure now if DMU configuration was postponed during the Setup 2 step. If the DMU is configured, the appropriate external physical devices must also be configured. Note that external physical devices should be held in reset until the DMU configuration is completed.
Packet over SONET Control                All                Configure now if POS configuration was postponed during the Setup 2 step.
Functional Picocode                      All                Functional picocode should be loaded into the instruction memory.
Port Config Table                        All                Functional values for the PCT should be set.
LU Def Table                             All                Functional values should be set.
CIA Memory                               All                Functional values should be set.
Hardware Classifier                      E_Type             Functional values should be set.
Hardware Classifier                      SAP                Functional values should be set.
Hardware Classifier                      PPP_Type           Functional values should be set.
Interrupt Masks                          All                Functional values should be set.
Timer Target                             All                Functional values should be set.
Interrupt Target                         All                Functional values should be set.
Address Bounds Check Control             All                Functional values should be set.
Static Table Entries                     All                Any static table entries should be loaded.
Ingress Target Port Data Storage Map     All
My Target Blade Address                  All
Local Target Blade Vector                All
Local MC Target Blade Vector             All
Target Port Data Storage Map             All
Egress QCBs                              All
QD Accuracy                              All
SA Address Array                         All
Counter Definition Table                 All
Counters                                 All                Any counters used by the counter manager must be read/cleared.
Policy Manager Memory                    All                Set up initial values for policies.
11.12 Step 11: Initialization Complete

Once steps 1 through 10 are complete and all items on the checklists have been configured, the NP4GS3 is ready for data traffic. The ports can be enabled (at the physical devices) and switch cells can start to flow.
12. Debug Facilities

12.1 Debugging Picoprocessors

The NP4GS3 provides several mechanisms to facilitate debugging of the picoprocessors.

12.1.1 Single Step

Each thread of the NP4GS3 can be enabled individually for single-step instruction execution. Single step is defined as advancing the instruction address by one cycle and executing the instruction accordingly for enabled threads. Coprocessors are not affected by single-step mode; therefore, coprocessor operations that at "live" speed would take several cycles may seem to take only one cycle in single-step mode.

There are two ways to enable a thread for single-step operation. The first is to write the Single Step Enable register. This register is a single-step bit mask for each thread and can be accessed through the control access bus (CAB). The second is the Single Step Exception register. This register is also a bit mask, one bit for each thread, but when set it indicates which threads are to be placed into single-step mode on a class 3 interrupt. When a thread is in single-step mode, the thread can only be advanced by writing the Single Step Command register.

12.1.2 Break Points

The NP4GS3 supports one instruction break point that is shared by all of the threads. When a thread's instruction address matches the break point address, a class 3 level interrupt is generated. This causes all threads enabled in the Single Step Exception register to enter single-step mode. The break point address is configured in the Break Point Address register.

12.1.3 CAB Accessible Registers

The scalar and array registers of the core language processor (CLP) and of the DPPU coprocessors are accessible through the CAB for evaluation purposes. The CLP's general purpose registers, which are directly accessible with the CLP, are mapped to read-only scalar registers on the control access bus.
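As an illustration of how these facilities combine, the fragment below sets the shared break point and arms selected threads to drop into single-step mode when it hits. The CAB addresses here are placeholders (they are not given in this excerpt), the cab_write() helper is the same hypothetical wrapper as before, and the assumption that these registers take a simple thread mask is exactly that: an assumption. Only the register names and the overall flow come from the text above.

```c
#include <stdint.h>

extern void cab_write(uint32_t cab_addr, uint32_t value);

/* Placeholder CAB addresses: consult the register map for real values. */
#define BREAK_POINT_ADDR_REG   0x0u  /* Break Point Address register      */
#define SINGLE_STEP_EXC_REG    0x0u  /* Single Step Exception register    */
#define SINGLE_STEP_CMD_REG    0x0u  /* Single Step Command register      */

/* Arm the shared instruction break point and mark which threads should
 * enter single-step mode when the resulting class 3 interrupt fires.    */
static void arm_breakpoint(uint32_t picocode_addr, uint32_t thread_mask)
{
    cab_write(SINGLE_STEP_EXC_REG, thread_mask);   /* e.g. threads 1, 2   */
    cab_write(BREAK_POINT_ADDR_REG, picocode_addr);
}

/* Once a thread is in single-step mode, it only advances when the
 * Single Step Command register is written.                              */
static void step_once(uint32_t thread_mask)
{
    cab_write(SINGLE_STEP_CMD_REG, thread_mask);
}
```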
12.2 RISCWatch

The NP4GS3 supports RISCWatch through the JTAG interface. RISCWatch is a hardware and software development tool for the embedded PowerPC. It provides processor control and source-level debugging features, including:

- On-chip debugging via the IEEE 1149.1 (JTAG) interface
- Target monitor debugging
- Open-source real-time operating system-aware debugging
- Source-level and assembler debugging of C/C++ executables
- Real-time trace support via the RISCTrace feature for the PowerPC 400 series
- Network support that enables customers to remotely debug the systems they are developing
- Support for the industry-standard Embedded ABI for PowerPC and the XCOFF ABI
- Command-file support for automated test and command sequences
- Simple and reliable 16-pin interface to the system the customer is developing
- Ethernet-to-target JTAG interface hardware
- Multiple hosts supported
- Intuitive and easy-to-use windowed user interface

For more information, go to http://www-3.ibm.com/chips/techlib/techlib.nsf/products/riscwatch_debugger.
13. Configuration

The IBM PowerNP NP4GS3 must be configured after internal diagnostics have run. Configuration is performed by a CPF that generates guided traffic to write configuration registers. These configuration registers are reset by the hardware to a minimal option set. The following sections describe all configuration registers and their reset state. A base address and offset are provided.

13.1 Memory Configuration

The NP4GS3 is supported by a number of memory subsystems, as shown in Figure 13-1 below. These memory subsystems contain data buffers and controls used by the NP4GS3. The D0, D1, D2, D3, D4, DS_0, and DS_1 subsystems are required to operate the NP4GS3 base configuration. Memory subsystems Z0, Z1, and D6 are optional and provide additional functionality or capability when added to the required set of memory subsystems. In its base configuration, the NP4GS3 does not perform enhanced scheduling and has a limited look-up search capability. The enabling of memory subsystem interfaces is controlled by the contents of the Memory Configuration register. The bits in this register are set by hardware during reset to enable the base configuration.

Figure 13-1. NP4GS3 Memory Subsystems
(Diagram not reproduced. It shows the NP4GS3 connected to the external memory subsystems D0 through D4, D6, DS0, DS1, Z0, and Z1, annotated with their interface widths.)
13.1.1 Memory Configuration Register (Memory_Config)

The Memory Configuration register enables or disables memory interfaces. It also enables the egress scheduler and the Z1 memory interface required by the egress scheduler to operate.

Access type: Read/Write
Base address: X'A000 0120'

Field name   Bit(s)   Reset   Description
Reserved     31:12            Reserved.
Sch_Ena      11:10    00      Scheduler enable control. Enables both the egress scheduler and the Z1 memory interface.
                              00  Scheduler disabled, Z1 interface disabled
                              01  Scheduler enabled, Z1 interface enabled
                              10  Scheduler disabled, Z1 interface enabled (NP4GS3B (R2.0) or later)
                              11  Reserved
                              When enabling the scheduler, the target port queue count QCnt_PQ must be zero for all ports. When setting this value to 0, the target port queue count QCnt_PQ+FQ must be zero for all ports.
Z0           9        0       Z0 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
D4           8        1       D4 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
D3           7        1       D3 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
D2           6        1       D2 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
D1           5        1       D1 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
D0           4        1       D0 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
Reserved     3        0       Reserved.
D6           2        0       D6 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
DS_1         1        1       DS_1 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
DS_0         0        1       DS_0 memory subsystem interface enable control.
                              1  Interface enabled    0  Interface disabled
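For reference, a value for this register can be assembled as in the hedged C fragment below (the helper name is hypothetical; the bit positions are those in the table above). The example enables the base configuration plus the optional D6 subsystem and the egress scheduler, which also requires the Z1 interface to be populated.

```c
#include <stdint.h>

/* Bit positions from the Memory_Config field table. */
#define MC_SCH_ENA_SHIFT  10   /* 2-bit field: 01 = scheduler + Z1 enabled */
#define MC_Z0             (1u << 9)
#define MC_D4             (1u << 8)
#define MC_D3             (1u << 7)
#define MC_D2             (1u << 6)
#define MC_D1             (1u << 5)
#define MC_D0             (1u << 4)
#define MC_D6             (1u << 2)
#define MC_DS_1           (1u << 1)
#define MC_DS_0           (1u << 0)

/* Base configuration (D0-D4, DS_0, DS_1) plus optional D6 and the egress
 * scheduler; Z0 is left disabled in this example.                        */
static uint32_t memory_config_value(void)
{
    return (0x1u << MC_SCH_ENA_SHIFT)   /* Sch_Ena = 01                    */
         | MC_D4 | MC_D3 | MC_D2 | MC_D1 | MC_D0
         | MC_D6 | MC_DS_1 | MC_DS_0;
}
```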
13.1.2 DRAM Parameter Register (DRAM_Parm)

The DRAM Parameter register controls the operation of the DRAMs used by the EPC and the egress EDS. These DRAMs are controlled separately. Base address offset 0 holds the EPC and EDS DRAM parameters (with some field differences between NP4GS3A (R1.1) and NP4GS3B (R2.0)); base address offset 1, present on NP4GS3B (R2.0) only, holds the D6 DRAM parameters. Per the register layout, fields listed with two bit positions (for example, CAS_Latency at 26:24 and 10:8) have one copy in the EPC_DRAM_Parms group (bits 31:16) and one in the EDS_DRAM_Parms group (bits 15:0).

Access type: Read/Write
Base address: X'A000 2400'

Base address offset 0

Field name                  Bit(s)       Reset   Description
Strobe_Cntl                 31:30        00      Strobe control for DDR DRAM interfaces (NP4GS3B (R2.0) or later).
                                                 00  Each DDR DRAM interface uses one strobe (xx_DQS) for each byte of data lines.
                                                 01  The D0, D1, D2, and D3 DDR DRAM interfaces use one strobe (xx_DQS) for each byte of data lines; the DS0, DS1, and D4 DDR DRAM interfaces use one strobe (xx_DQS) for all data lines.
                                                 10  Reserved
                                                 11  Reserved
Reserved                    29                   Reserved.
Restricted                  28           1       Restricted use only (NP4GS3A (R1.1)). Do not modify.
CS_Refresh_Rate             28           1       Controls the refresh rate for D0, D1, D2, and D3 (NP4GS3B (R2.0)).
                                                 0  Double refresh rate (7.5 µs)
                                                 1  Normal refresh rate (15 µs)
CS_Bank_RW_Skp              27           0       Controls the number of banks that must be skipped within a DRAM access window when switching from a read to a write.
                                                 0  Skip 1 bank. On NP4GS3B (R2.0), CAS_Latency (bits 26:24) must be set to '010' to support this option.
                                                 1  Skip 2 banks
                                                 The proper setting for this bit is related to the DRAM timing specifications and the CAS_Latency value. On NP4GS3A (R1.1) this control applies to all control stores (D0, D1, D2, D3, D6); on NP4GS3B (R2.0) it applies to the D0, D1, D2, and D3 control stores.
CAS_Latency                 26:24, 10:8  010     DRAM column address strobe latency. The value corresponds to the DRAM's read latency measured from the column address.
                                                 000-001  Reserved
                                                 010  2 clock cycles (PC266A-compliant DRAMs)
                                                 011  3 clock cycles
                                                 100-101  Reserved
                                                 110  2.5 clock cycles (PC266B-compliant DRAMs)
                                                 111  Reserved
8_Bank_Ena                  23, 7        0       Eight-bank addressing mode enable control. For proper operation, must be set to 0.
11/10                       22, 6        1       Eleven- or ten-cycle DRAM control. Controls the number of core clock cycles the DRAM controller uses to define an access window.
                                                 0  10-cycle DRAM
                                                 1  11-cycle DRAM
Drive_Strength              21, 5        0       DRAM drive strength control.
                                                 0  Strong
                                                 1  Weak
DLL_Disable                 20, 4        0       DLL disable control. For proper operation, must be set to 0.
DQS_Clamping_Ena            19, 3        0       DQS clamping enable control. For proper operation, must be set to 0.
DRAM_Size                   18:17        00      Indicates the size of the control stores D0-D3 in DRAMs.
                                                 00  4 x 1M x 16 DDR DRAM, burst = 4
                                                 01  4 x 2M x 16 DDR DRAM, burst = 4
                                                 10  4 x 4M x 16 DDR DRAM, burst = 4
                                                 11  Reserved
FET_Cntl_Ena                16, 0        0       FET control enable. For proper operation, must be set to 0.
Reserved                    15                   Reserved (NP4GS3B (R2.0)).
D6_Parity_Mode              15           1       DRAM D6 parity mode disable control (NP4GS3A (R1.1)).
                                                 0  The D6 interface supports an additional two DDR DRAMs which support byte parity. The hardware generates parity on write and checks parity on read.
                                                 1  The D6 interface does not support parity.
D0_Width                    14           1       D0 width control. Indicates whether one or two DRAMs are used for the D0 CS.
                                                 0  Single-wide configuration using one DRAM. A single bank access provides 64 bits of data.
                                                 1  Double-wide configuration using two DRAMs. A single bank access provides 128 bits of data.
Restricted                  13           0       Restricted use only (NP4GS3A (R1.1)). Do not modify.
DS_D4_Refresh_Rate          13           0       Refresh rate control for DS0, DS1, and D4 (NP4GS3B (R2.0)).
                                                 0  Normal refresh rate (15 µs)
                                                 1  Double refresh rate (7.5 µs)
DS_Error_Checking_Disable   12           0       Egress data store error checking disable control. When this field is set to 1, all DRAM error checking for the egress data store is disabled.
D4_Error_Checking_Disable   11           0       D4 DRAM error checking disable. When this field is set to 1, all DRAM error checking for the D4 DRAM is disabled.
DRAM_Size                   2:1          00      Indicates the size of the egress data store and D4 DRAMs.
                                                 00  4 x 1M x 16 DDR DRAM, burst = 4, x2
                                                 01  4 x 2M x 16 DDR DRAM, burst = 4, x2
                                                 10  4 x 4M x 16 DDR DRAM, burst = 4, x2
                                                 11  Reserved
                                                 The setting of this field affects the size of the extended stack queues as follows:
                                                 GQ        00      01      10      11
                                                 GTQ       48 K    96 K    192 K   Reserved
                                                 GPQ       48 K    96 K    192 K   Reserved
                                                 GFQ       96 K    192 K   384 K   Reserved
                                                 GR0       96 K    192 K   384 K   Reserved
                                                 GR1       96 K    192 K   384 K   Reserved
                                                 GB0       96 K    192 K   384 K   Reserved
                                                 GB1       96 K    192 K   384 K   Reserved
                                                 Discard   96 K    192 K   384 K   Reserved

NP4GS3B (R2.0) base address offset 1

Field name          Bit(s)   Reset   Description
Reserved            31:14            Reserved.
D6_Refresh_Rate     13       0       Controls the refresh rate for D6.
                                     0  Normal refresh rate (15 µs)
                                     1  Double refresh rate (7.5 µs)
D6_Bank_RW_Skp      12       0       Controls the number of banks that must be skipped within a DRAM access window when switching from a read to a write.
                                     0  Skip 1 bank. CAS_Latency (bits 11:9) must be set to '010' to support this option.
                                     1  Skip 2 banks
                                     The proper setting is related to the DRAM timing specifications and the CAS_Latency value. This control is for D6 only.
CAS_Latency         11:9     010     DRAM column address strobe latency. The value corresponds to the DRAM's read latency measured from the column address.
                                     000-001  Reserved
                                     010  2 clock cycles (PC266A-compliant DRAMs)
                                     011  3 clock cycles
                                     100-101  Reserved
                                     110  2.5 clock cycles (PC266B-compliant DRAMs)
                                     111  Reserved
11/10               8        0       Eleven- or ten-cycle DRAM control. Controls the number of core clock cycles the DRAM controller uses to define an access window.
                                     0  10-cycle DRAM
                                     1  11-cycle DRAM
Drive_Strength      7        0       DRAM drive strength control.
                                     0  Strong
                                     1  Weak
DLL_Disable         6        0       DLL disable control. For proper operation, must be set to 0.
DQS_Clamping_Ena    5        0       DQS clamping enable control. For proper operation, must be set to 0.
D6_DRAM_Size        4:2      110     Indicates the size of the D6 DRAM.
                                     000  4 x 1M x 16
                                     001  4 x 2M x 16
                                     010  4 x 4M x 16
                                     011  Reserved
                                     100  4 x 4M x 4
                                     101  4 x 8M x 4
                                     110  4 x 16M x 4
                                     111  Reserved
FET_Cntl_Ena        1        0       FET control enable. For proper operation, must be set to 0.
D6_Parity_Mode      0        1       DRAM D6 parity mode disable control.
                                     0  The D6 interface supports an additional two DDR DRAMs which support byte parity. The hardware generates parity on write and checks parity on read.
                                     1  The D6 interface does not support parity.
13.2 Master Grant Mode Register (MG_Mode)

This configuration register configures the master grant I/O (MGrant_A, MGrant_B) for either nested priority or independent priority mode.

Access type: Read/Write
Base address: X'A000 0820'

Field name   Bit(s)   Reset   Description
Reserved     31:1             Reserved.
MG_Mode      0        0       0  The MGrant I/O is defined for nested priority encoding:
                                 00  No grant (on any priority)
                                 01  Priority 0 has grant
                                 10  Priorities 0 and 1 have grant
                                 11  Priorities 0, 1, and 2 have grant
                              1  The MGrant I/O is defined for independent priority encoding:
                                 00  No grant (on any priority)
                                 01  Priorities 0 and 1 have grant
                                 10  Priority 2 has grant
                                 11  Priorities 0, 1, and 2 have grant
13.3 TB Mode Register (TB_Mode)

The target blade mode register configures the maximum number of target network processors supported.

Access type: Read/Write
Base address: X'A000 0410'

Field name   Bit(s)   Reset   Description
Reserved     31:2             Reserved.
TB_Mode      1:0      00      Target blade mode. This field defines the target blade mode currently in use by the NPR:
                              00  16-blade mode. Valid addresses are 0:15. Multicast is indicated as a 16-bit vector.
                              01  Reserved
                              10  64-blade mode. Valid unicast target blade field encodes are 0 through 63. Multicast encodes are in the range of 512 through 65535.
                              11  Reserved
13.4 Egress Reassembly Sequence Check Register (E_Reassembly_Seq_Ck)

This configuration register enables sequence checking by the egress reassembly logic. The sequence checking ensures that start-of-frame, optional middle-of-frame, and end-of-frame indications occur in the expected order for each cell of a frame being reassembled. Each cell that does not indicate start or end of frame carries a sequence number that is checked for proper order.

Access type: Read/Write
Base address: X'A000 0420'

Field name    Bit(s)   Reset   Description
Reserved      31:1             Reserved.
Seq_Chk_Ena   0        1       Sequence check enable control.
                               0  Sequence checking disabled
                               1  Sequence checking enabled for the E_EDS
13.5 Aborted Frame Reassembly Action Control Register (AFRAC)

This configuration register controls the action the hardware takes on a frame whose reassembly was aborted due to the receipt of an abort command in a cell header for the frame.

Access type: Read/Write
Base address: X'A000 0440'

Field name   Bit(s)   Reset   Description
Reserved     31:1             Reserved.
AFRAC        0        0       Aborted frame reassembly action control.
                              0  Aborted frames are enqueued to the discard queue.
                              1  Aborted frames are enqueued with other frames on the associated GQ.
13.6 Packing Registers

13.6.1 Packing Control Register (Pack_Ctrl)

This configuration register is used to enable the transmission of packed cells across the switch interface. When enabled, the I-SDM determines whether there is available space in the current cell being prepared for transmission and whether there is an available packing candidate. A frame can be packed only if it is unicast, has the same target blade destination as the current frame, and is longer than 48 bytes. The packed frame starts on the next QW boundary following the last byte of the current frame. This control does not affect the ability of the E_EDS to receive packed cells.

Access type: Read/Write
Base address: X'A000 0480'

Field name   Bit(s)   Reset   Description
Reserved     31:1             Reserved.
Pack_Ena     0        1       Packing enabled flag.
                              0  Cell packing disabled
                              1  Cell packing enabled
13.6.2 Packing Delay Register (Pack_Dly) (NP4GS3B (R2.0))

The Packing Delay register modifies the ingress scheduler's behavior through programmable wait periods per TB, based on SP media speed as determined by the DM_Bus_Mode setting (see 13.22 Data Mover Unit (DMU) Configuration on page 492). During processing of a frame for transmission across the switch interface, when it is determined that the next service will be the last for the frame and there is no packing candidate at that time, the selected TB run queue is not included in subsequent service selections. A timer is used per TB run queue to keep the queue out of the service selection until either the timer expires or a packing candidate becomes available (the corresponding SOF ring goes from empty to non-empty). Frames from the wrap DMU are not affected by this mechanism. Multicast frames (which include guided traffic) and the discard queues are not affected by this mechanism. This function is disabled by setting the packing delay to 0.

Access type: Read/Write
Base address: X'A020 8000'

Field name     Bit(s)   Reset   Description
Reserved       31:20            Reserved.
OC3_Delay      19:16    0       Maximum packing delay in units of 3840 ns
OC12_Delay     15:12    0       Maximum packing delay in units of 960 ns
Fasten_Delay   11:8     0       Maximum packing delay in units of 4800 ns
GB_Delay       7:4      0       Maximum packing delay in units of 480 ns
OC48_Delay     3:0      0       Maximum packing delay in units of 240 ns
13.7 Initialization Control Registers

13.7.1 Initialization Register (Init)

This register controls the initialization of the functional islands. Each functional island and the DRAM controllers begin initialization when the corresponding bits in this register are set to '1'. Each functional island signals the successful completion of initialization by setting its bit in the Init_Done register. Once a functional island has been initialized, changing the state of these bits no longer has any effect until a reset occurs.

Access type: Read/Write
Base address: X'A000 8100'

Field name               Bit(s)   Reset   Description
DMU_Init                 31:28    0000    Data mover unit initialization control. Individually initializes each DMU (bit 31 = DMU D, 30 = DMU C, 29 = DMU B, 28 = DMU A).
Functional_Island_Init   27       0       Functional island initialization control.
                                          0  NOP
                                          1  Functional islands start hardware initialization. Completion of hardware initialization is reported in the Initialization Done register.
DRAM_Cntl_Start          26       0       DRAM controller start control.
                                          0  NOP
                                          1  Causes the DRAM controllers to initialize and start operation. When initialization completes, the DRAM controllers set the CS_Init_Done and E_DS_Init_Done bits of the Initialization Done register.
Reserved                 25:0             Reserved.
13.7.2 Initialization Done Register (Init_Done)

This register indicates that functional islands have completed their initialization when the corresponding bits are set to '1'. This register tracks the initialization state of the functional islands. The GCH reads this register during the initialization of the NP4GS3.

Access type: Read Only
Base address: X'A000 8200'

Field name          Bit(s)   Reset   Description
E_Sched_Init_Done   31       0       Set to '1' by the hardware when the egress scheduler hardware completes its initialization.
DMU_Init_Done       30:26    0       Set to '1' by the hardware when the DMU hardware has completed its initialization.
                                     Bit  DMU
                                     30   D
                                     29   C
                                     28   B
                                     27   A
                                     26   Wrap
I_EDS_Init_Done     25       0       Set to '1' by the hardware when the ingress EDS completes its initialization.
Reserved            24               Reserved.
EPC_Init_Done       23       0       Set to '1' by the hardware when the EPC completes its initialization.
Reserved            22               Reserved.
CS_Init_Done        21       0       Set to '1' by the hardware when the control store's DRAM controller completes its initialization.
E_DS_Init_Done      20       0       Set to '1' by the hardware when the egress data stores' DRAM controller completes its initialization.
E_EDS_Init_Done     19       0       Set to '1' by the hardware when the egress EDS completes its initialization.
Reserved            18:17            Reserved.
P_DASL_Init_Done    16       0       Set to '1' by the hardware when the primary DASL completes its initialization. The DASL interface continues to send synchronization cells.
A_DASL_Init_Done    15       0       Set to '1' by the hardware when the alternate DASL completes its initialization. The DASL interface continues to send synchronization cells.
Reserved            14:0             Reserved.
13.8 NP4GS3 Ready Register (NPR_Ready)

Access type: Read/Write
Base address: X'A004 0020'

Field name   Bit(s)   Reset   Description
Ready        31       0       Ready is configured by picocode to drive a chip pin to indicate that the NP4GS3 has been initialized and is ready to receive guided traffic.
                              0  NP4GS3 not ready
                              1  NP4GS3 ready
Reserved     30:0             Reserved.
13.9 Phase Locked Loop Registers

13.9.1 Phase Locked Loop Fail Register (PLL_Lock_Fail)

Access type: Read Only
Base address: X'A000 0220'

Field name         Bit(s)   Description
Reserved           31:7     Reserved. (For NP4GS3A (R1.1), the reserved range is 31:3.)
PLLC_Lock_Failed   6        PLLC lock failure history (NP4GS3B (R2.0)).
                            0  PLLC has stayed correctly locked continuously since the last reset.
                            1  PLLC has lost lock (lock was lost at some time since the last reset).
PLLB_Lock_Failed   5        PLLB lock failure history (NP4GS3B (R2.0)).
                            0  PLLB has stayed correctly locked continuously since the last reset.
                            1  PLLB has lost lock (lock was lost at some time since the last reset).
PLLA_Lock_Failed   4        PLLA lock failure history (NP4GS3B (R2.0)).
                            0  PLLA has stayed correctly locked continuously since the last reset.
                            1  PLLA has lost lock (lock was lost at some time since the last reset).
PLL_C_Lock         3        Current status of the lock indicator of the core clock PLL.
                            0  Phase/frequency lock
                            1  Phase/frequency seek
PLL_B_Lock         2        Current status of the lock indicator of the DASL-B PLL.
                            0  Phase/frequency lock
                            1  Phase/frequency seek
PLL_A_Lock         1        Current status of the lock indicator of the DASL-A PLL.
                            0  Phase/frequency lock
                            1  Phase/frequency seek
Fail               0        Phase locked loop fail indicator. Indicates that an on-chip PLL has failed. This field is written by the clock logic.
                            0  PLLs OK
                            1  PLL failed
                            If Fail is indicated at the end of the reset interval (101 µs after reset is started), the operational device I/O is set to '1'. After the end of the reset interval, a change in Fail will not affect the operational device I/O.
                            Note: for NP4GS3B (R2.0), the operational device I/O is not affected by Fail.
13.10 Software Controlled Reset Register (Soft_Reset)

This register provides a control for software to reset the network processor.

Access type: Write Only
Base address: X'A000 0240'

Field name   Bit(s)   Reset   Description
Full_Reset   31       0       Full reset value. Resets the NP4GS3 hardware via the picocode. This is the same full reset function provided by the Blade_Reset I/O.
                              0  Reserved
                              1  NP4GS3 performs an internal hardware reset.
Reserved     30:0             Reserved.
13.11 Ingress Free Queue Threshold Configuration

13.11.1 BCB_FQ Threshold Registers

The value of the queue count in the BCB_FQ control block is continuously compared to the values contained in each of the three threshold registers. The result of this comparison affects the NP4GS3's flow control mechanisms. The values in these registers must be chosen such that BCB_FQ_Th_GT ≤ BCB_FQ_Th_0 ≤ BCB_FQ_Th_1 ≤ BCB_FQ_Th_2. For proper operation, the minimum value for BCB_FQ_Th_GT is X'08'.

13.11.2 BCB_FQ Threshold for Guided Traffic (BCB_FQ_Th_GT)

The ingress EDS reads this register to determine when to discard all packets received.

Access type: Read/Write
Base address: X'A000 1080'

Field name     Bit(s)   Reset   Description
BCB_FQ_Th_GT   31:27    X'08'   BCB free queue threshold GT value, measured in units of individual buffers. For example, a threshold value of X'01' represents a threshold of one buffer.
Reserved       26:0             Reserved.
13.11.3 BCB_FQ_Threshold_0 / _1 / _2 Registers (BCB_FQ_Th_0/_1/_2)

Access type: Read/Write
Base addresses:
  BCB_FQ_Th_0   X'A000 1010'
  BCB_FQ_Th_1   X'A000 1020'
  BCB_FQ_Th_2   X'A000 1040'

Field name              Bit(s)   Reset   Description
BCB_FQ_Th_0 / _1 / _2   31:24    X'00'   BCB free queue threshold 0 / 1 / 2 value, measured in units of 16 buffers. For example, a threshold value of X'01' represents a threshold of 16 buffers. The ingress EDS reads this field to determine when to perform a discard action. Violation of this threshold (BCB queue count is less than this threshold) sends an interrupt to the EPC.
                                         When BCB_FQ_Th_0 is violated, the discard action is to perform partial packet discards. New data buffers are not allocated for frame traffic and portions of frames may be lost.
                                         When BCB_FQ_Th_1 is violated, the discard action is to perform packet discards. New data buffers are not allocated for new frame traffic, but frames already started continue to allocate buffers as needed.
Reserved                23:0             Reserved.
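A hedged sketch of programming these thresholds is shown below. The cab_write() helper is hypothetical, the example values are arbitrary, and the ordering check assumes the GT/0/1/2 relationship above is meant in actual buffer counts (GT is expressed in single buffers, the other three in units of 16 buffers).

```c
#include <stdint.h>
#include <assert.h>

extern void cab_write(uint32_t cab_addr, uint32_t value);

#define BCB_FQ_TH_GT_REG  0xA0001080u  /* value in bits 31:27, unit = 1 buffer   */
#define BCB_FQ_TH_0_REG   0xA0001010u  /* value in bits 31:24, unit = 16 buffers */
#define BCB_FQ_TH_1_REG   0xA0001020u
#define BCB_FQ_TH_2_REG   0xA0001040u

static void set_bcb_fq_thresholds(uint32_t gt_buffers, uint32_t th0_x16,
                                  uint32_t th1_x16, uint32_t th2_x16)
{
    /* Minimum GT is X'08'; the thresholds must be non-decreasing.          */
    assert(gt_buffers >= 0x08u && gt_buffers <= 0x1Fu);
    assert(gt_buffers <= th0_x16 * 16u);
    assert(th0_x16 <= th1_x16 && th1_x16 <= th2_x16 && th2_x16 <= 0xFFu);

    cab_write(BCB_FQ_TH_GT_REG, gt_buffers << 27);
    cab_write(BCB_FQ_TH_0_REG,  th0_x16 << 24);
    cab_write(BCB_FQ_TH_1_REG,  th1_x16 << 24);
    cab_write(BCB_FQ_TH_2_REG,  th2_x16 << 24);
}

/* Example: guided-traffic threshold of 8 buffers, then 32/64/128 buffers.  */
/* set_bcb_fq_thresholds(0x08, 2, 4, 8);                                    */
```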
13.12 Ingress Target DMU Data Storage Map Register (I_TDMU_DSU)

This register defines the egress data storage units (DSUs) used for each DMU.

Access type: Read/Write
Base address: X'A000 0180'

Field name   Bit(s)   Reset   Description
DSU_Encode   31:24    00      During enqueue operations, the values in this configuration register are loaded into the ingress frame control block's DSU field when a DSU value is not specified by the enqueue. The hardware determines the target DMU from enqueue information and uses the corresponding field in this configuration register to load the DSU field. Four fields are defined, one for each DMU, with the following encoding:
                              00  DSU 0
                              01  DSU 1
                              10  Reserved
                              11  DSU 0, DSU 1
                              Bits     DMU
                              31:30    DMU_D
                              29:28    DMU_C
                              27:26    DMU_B
                              25:24    DMU_A
Reserved     23:0             Reserved.
13.13 Embedded Processor Complex Configuration

13.13.1 PowerPC Core Reset Register (PowerPC_Reset)

This register contains a control value used to hold the PowerPC core in the reset state.

Access type: Read/Write
Base address: X'A000 8010'

Field name   Bit(s)   Reset   Description
PPC_Reset    31       1       PowerPC core reset. Holds the PowerPC core in a reset state when set to 1. The rest of the PowerPC functional island is not affected by this control and can only be reset by a full reset.
                              0  PowerPC core reset disabled
                              1  PowerPC core held in reset
Reserved     30:0             Reserved.
13.13.2 PowerPC Boot Redirection Instruction Registers (Boot_Redir_Inst)

In system implementations in which the embedded PowerPC boots from the D6 DRAM, the mailbox and DRAM interface macro performs PowerPC boot address redirection. Under these conditions, the hardware provides instructions that redirect the boot sequence to a location in the PowerPC's DRAM. Storage for eight instructions is provided by the Boot_Redir_Inst registers.

The PowerPC boot redirection instruction (Boot_Redir_Inst) registers are accessed from the CAB interface. These registers provide capacity for eight instructions for PLB addresses X'FFFF FFE0' through X'FFFF FFFC'. These instructions redirect the PowerPC to boot from a location in the PowerPC's DRAM and are configured before PPC_Reset is cleared.

Access type: Read/Write
Base address: X'3800 0110' - X'3800 0117'

Field name        Bit(s)   Reset          Description
Boot_Redir_Inst   31:0     X'0000 0000'   PowerPC boot redirection instruction values. Contains instructions used by the PowerPC to redirect the boot sequence to a location in D6 DRAM.
                                          Offset   Corresponding PowerPC instruction address
                                          0        X'FFFF FFE0'
                                          1        X'FFFF FFE4'
                                          2        X'FFFF FFE8'
                                          3        X'FFFF FFEC'
                                          4        X'FFFF FFF0'
                                          5        X'FFFF FFF4'
                                          6        X'FFFF FFF8'
                                          7        X'FFFF FFFC'
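The following fragment sketches how the redirection registers might be filled before PPC_Reset is cleared. It assumes the same hypothetical cab_write() helper; the eight 32-bit words (for example, a short branch stub produced by the PowerPC toolchain, with the word for offset 7 corresponding to the fetch at X'FFFF FFFC') are supplied by the caller rather than encoded here.

```c
#include <stdint.h>

extern void cab_write(uint32_t cab_addr, uint32_t value);

#define BOOT_REDIR_INST_BASE  0x38000110u   /* offsets 0 through 7         */
#define POWERPC_RESET_REG     0xA0008010u   /* bit 31 = ppc_reset          */

/* Write the eight redirection words (mapped to PowerPC instruction
 * addresses X'FFFF FFE0' through X'FFFF FFFC'), then release the core.   */
static void load_boot_redirect(const uint32_t stub[8])
{
    for (uint32_t i = 0; i < 8; i++)
        cab_write(BOOT_REDIR_INST_BASE + i, stub[i]);

    cab_write(POWERPC_RESET_REG, 0x00000000u);  /* clear ppc_reset         */
}
```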
ibm powernp np4gs3 network processor preliminary configuration page 462 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.13.3 watch dog reset enable register (wd_reset_ena) this register controls the action of a watch dog timer expiration. when set to 1, the second expiration of the watch dog timer causes a reset to the powerpc core. access type read/write base address x ? a000 4800 ? wd_reset_ena reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description wd_reset_ena 31 0 reset enable. 0 disable reset of powerpc core 1 enable reset of powerpc core on watch dog expire reserved 30:0 reserved
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 463 of 554 13.13.4 boot override register (boot_override) with the proper setting of the boot_picocode i/o (see table 2-30: miscellaneous pins in the np4gs3 datasheet), this register provides boot sequence control. when the boot_picocode i/o is held to 0, boot control is given over to the spm interface hardware. when the spm completes its boot actions, a por interrupt is set (interrupt vector 1, bit 0) which causes the gfh to start execution. when the boot_picocode i/o is held to 1, boot control is given over to an external source, either a host on the pci or the eppc. when boot is completed, a cab write is required to set bd_override to 1 which causes a por interrupt (interrupt vector 1, bit 0), causing the gfh to start execution. access type read/write base address x ? a000 8800 ? bl_override bd_override reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description bl_override 31 see description this is set to the value of the boot_picocode i/o during reset. 0 boot code is loaded by the spm interface state machine. 1 boot code is loaded by intervention of software. the boot code can be loaded using the cabwatch interface, using the pci bus, or by using the embedded power pc. when reset to ? 1 ? , a cab write can set this field to ? 0 ? to start the spm interface controlled boot sequence. the spm interface reads this field to control the behavior of its state machine. bd_override 30 0 boot done override control value. 0nop 1 when the spm interface controlled boot loading sequence is overridden, this bit is set after the epc ? s instruction memory has been loaded to start the epc. a cab write can set this field to ? 1 ? to indicate that the epc ? s instruction memory is loaded. the configuration register logic reads this field when generating the por interrupt to the epc. reserved 29:0 reserved
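the following sketch outlines the externally controlled boot completion step described above (boot_picocode held to 1). it is a sketch under stated assumptions, not a definitive sequence: cab_read32()/cab_write32() are assumed helpers, and load_epc_instruction_memory() stands in for whatever mechanism (a host on the pci or the eppc) actually loads the epc picocode.

    /*
     * sketch of the software-driven boot completion: after the epc's
     * instruction memory is loaded, a cab write sets bd_override to 1,
     * which raises the por interrupt (interrupt vector 1, bit 0) and
     * starts the gfh.
     */
    #include <stdint.h>

    #define BOOT_OVERRIDE    0xA0008800u
    #define BD_OVERRIDE_BIT  (1u << 30)    /* boot done override */

    extern uint32_t cab_read32(uint32_t addr);                 /* assumed */
    extern void     cab_write32(uint32_t addr, uint32_t val);  /* assumed */
    extern void     load_epc_instruction_memory(void);         /* assumed loader */

    static void finish_external_boot(void)
    {
        load_epc_instruction_memory();

        uint32_t v = cab_read32(BOOT_OVERRIDE);
        cab_write32(BOOT_OVERRIDE, v | BD_OVERRIDE_BIT);
    }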
ibm powernp np4gs3 network processor preliminary configuration page 464 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.13.5 thread enable register (thread_enable) this register contains control information used to enable or disable each thread. access type read/write bits 31:1 read only bit 0 base address x ? a000 8020 ? thread_num_ena (31:0) 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description thread_num_ena (31:1) 31:1 0 thread enable. 0 disabled 1 corresponding thread enabled for use thread_num_ena 0 0 1 thread 0 (the gfh) is always enabled and cannot be disabled through this bit.
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 465 of 554 13.13.6 gfh data disable register (gfh_data_dis) this register is used to enable the dispatch to assign data frames to the gfh for processing. access type read/write base address x ? 24c0 0030 ? reserved gfh_data_dis 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:1 reserved gfh_data_dis 0 0 guided frame handler data enable control. 0 enabled 1 not enabled this field is configured to enable or disable the dispatching of data frames to the gfh.
ibm powernp np4gs3 network processor preliminary configuration page 466 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.13.7 ingress maximum dcb entries (i_max_dcb) this register defines the maximum number of ingress frames that are currently allowed to be simultaneously serviced by the dispatch unit. this limits the total number of ingress frames occupying space in the dispatch control block. np4gs3b (r2.0) access type read/write base address x ? 2440 0c40 ? i_max_dcb reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description i_max_dcb 31:28 x ? 6 ? maximum number of ingress frames allowed service in the dispatch con- trol block. reserved 27:0 reserved i_max_dcb reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description i_max_dcb 31:27 x ? 80 ? maximum number of ingress frames allowed service in the dispatch con- trol block. reserved 26:0 reserved
13.13.8 egress maximum dcb entries (e_max_dcb)

this register defines the maximum number of egress frames that are currently allowed to be simultaneously serviced by the dispatch unit. this limits the total number of egress frames occupying space in the dispatch control block.

np4gs3b (r2.0)
access type: read/write
base address: x'2440 0c50'

field name  bit(s)  reset  description
e_max_dcb   31:28   x'6'   maximum number of egress frames allowed service in the dispatch control block.
reserved    27:0           reserved

field name  bit(s)  reset   description
e_max_dcb   31:27   x'80'   maximum number of egress frames allowed service in the dispatch control block.
reserved    26:0            reserved
13.13.9 my target blade address register (my_tb)

this register contains the local blade address value.

access type: master copy read/write; thread copies read only
base addresses:
  master copy  x'a000 4080'
  thread 0   x'2000 0890'    thread 16  x'2100 0890'
  thread 1   x'2010 0890'    thread 17  x'2110 0890'
  thread 2   x'2020 0890'    thread 18  x'2120 0890'
  thread 3   x'2030 0890'    thread 19  x'2130 0890'
  thread 4   x'2040 0890'    thread 20  x'2140 0890'
  thread 5   x'2050 0890'    thread 21  x'2150 0890'
  thread 6   x'2060 0890'    thread 22  x'2160 0890'
  thread 7   x'2070 0890'    thread 23  x'2170 0890'
  thread 8   x'2080 0890'    thread 24  x'2180 0890'
  thread 9   x'2090 0890'    thread 25  x'2190 0890'
  thread 10  x'20a0 0890'    thread 26  x'21a0 0890'
  thread 11  x'20b0 0890'    thread 27  x'21b0 0890'
  thread 12  x'20c0 0890'    thread 28  x'21c0 0890'
  thread 13  x'20d0 0890'    thread 29  x'21d0 0890'
  thread 14  x'20e0 0890'    thread 30  x'21e0 0890'
  thread 15  x'20f0 0890'    thread 31  x'21f0 0890'

register layout (bits 31:0): reserved (31:6), tb (5:0)

field name  bit(s)  reset  description
reserved    31:6           reserved
tb          5:0     0      blade address of this network processor. the value in this field is limited by the tb_mode configured (see section 13.3 on page 446). it is further limited when configured for dasl wrap mode (see section 13.29.1 on page 502) to a value of either 1 or 0.
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 469 of 554 13.13.10 local target blade vector register (local_tb_vector) this register is used to determine the interface used when forwarding traffic. the register is defined as a target blade bit vector where each bit in the register represents a target blade address. for unicast traffic, in all target blade modes, the target blade address (defined in both the fcbpage and fcb2) is used to select the appropriate bit in the register for comparison. if the selected bit is set to 1, then local dasl interface is used to transmit the cell, otherwise the remote dasl interface is used. for multicast traffic (in 16 blade mode only) the target blade address, (defined in both the fcbpage and fcb2 as a bit vector) is compared bit by bit to the contents of this register. when a bit in the target blade address is set to 1, and the corresponding bit in this register is also set to 1, then the local dasl interface is used to transmit the cell. when a bit in the target blade address is set to 1 and the corresponding bit in this register is set to 0, then the remote dasl interface is used to transmit the cell. for multicast traffic (in 64 blade mode) the local mctarget blade vector register is used (see local mctarget blade vector register (local_mc_tb_max) on page 470). base address offset 0 base address offset 1 base address offset 0 base address offset 1 access type read/write base addresses x ? a000 4100 ? tb(0:31) 313029282726252423222120191817161514131211109876543210 tb(32:63) 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description tb(0:31) 31:0 0 this is a bit vector representing target blade addresses 0 to 31. field name bit(s) reset description tb(32:63) 31:0 0 this is a bit vector representing target blade addresses 32 to 63.
13.13.11 local mc target blade vector register (local_mc_tb_max)

when configured for 64 blade mode, with both the dasl-a and the dasl-b active, this register is used to determine the interface to be used when forwarding multicast traffic. the target blade address (defined in both the fcbpage and fcb2) is compared to the value in the tb_multicast_identifier field. if the target blade address is less than this value, then the local dasl interface is used to transmit the cell; otherwise the remote dasl interface is used. the tb_multicast_identifier field is reset to 0 during power on, causing all multicast traffic to use the remote dasl.

access type: read/write
base address: x'a000 4200'
register layout (bits 31:0): tb_multicast_identifier (31:16), reserved (15:0)

field name               bit(s)  reset  description
tb_multicast_identifier  31:16   0      multicast local maximum. the tb field for a frame is compared to the value of this field; when smaller, the frame is local. used only when not configured for 16 blade mode.
reserved                 15:0           reserved
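the sketch below restates the selection rules of sections 13.13.10 and 13.13.11 as plain c, purely for clarity. the function names and the 64-bit combination of the two local_tb_vector words are illustrative conveniences, not part of the hardware interface.

    /*
     * sketch of the local/remote dasl selection rules. how the two 32-bit
     * cab words of local_tb_vector pack into the 64-bit value used here is
     * abstracted away.
     */
    #include <stdbool.h>
    #include <stdint.h>

    /* unicast, any target blade mode: test the tb's bit in local_tb_vector */
    static bool unicast_uses_local_dasl(uint64_t local_tb_vector, unsigned tb_addr)
    {
        return (local_tb_vector >> tb_addr) & 1u;
    }

    /* multicast, 64 blade mode: compare the tb address against
     * tb_multicast_identifier (local_mc_tb_max); smaller means local */
    static bool multicast64_uses_local_dasl(uint16_t tb_multicast_identifier,
                                            unsigned tb_addr)
    {
        return tb_addr < tb_multicast_identifier;
    }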
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 471 of 554 13.13.12 ordered semaphore enable register (ordered_sem_ena) (np4gs3b (r2.0)) this register enables ordered semaphore operation in the np4gs3. when ordered semaphores are enabled, a thread is restricted to only locking one semaphore at a time. when ordered semaphores are disabled, a thread may hold locks on two unordered semaphores at a time. failure to follow these restrictions will result in unpredictable operation. access type read/write base addresses x'2500 0180' reserved ordered_ena 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:2 reserved ordered_ena 1:0 0 00 ordered semaphores disabled 01 ordered semaphores enabled for ordered semaphore id queue 0 10 ordered semaphores enabled for ordered semaphore id queue 1 11 ordered semaphores enabled for both ordered semaphore id queues 0 and 1.
ibm powernp np4gs3 network processor preliminary configuration page 472 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.14 flow control structures 13.14.1 ingress flow control hardware structures 13.14.1.1 ingress transmit probability memory register (i_tx_prob_mem) the ingress flow control hardware contains an internal memory that holds 64 different transmit probabilities for flow control. the probability memory occupies 16 entries in the cab address space. each probability entry in the cab is 32 bits wide and contains four 7-bit probabilities. the 4-bit access address to each probability entry comprises two components: the 3-bit qosclass field (qqq) and the 1-bit remote egress status bus value for the current priority (t). the address is formed as qqqt. the qosclass is taken from the ingress fcbpage. the remote egress status bus value for the current priority reflects the threshold status of the egress ? leased twin count. the ingress flow control hardware accesses probabilities within each probability memory entry by using the 2-bit color portion of the fc_info field taken from the ingress fcbpage as an index. the probabilities are organized as shown below. access type read/write base address x ? 3000 00#0 ? note: ? the base address is listed with a ? # ? replacing one of the hex digits. the ? # ? ranges from x ? 0 ? to x ? f ? , and indicates which probability entry is being referenced. reserved prob_0 reserved prob_1 reserved prob_2 reserved prob_3 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31 reserved prob_0 30:24 transmit probability 0 - transmit probability accessed when color is ? 11 ? reserved 23 reserved prob_1 22:16 transmit probability 1 - transmit probability accessed when color is ? 10 ? reserved 15 reserved prob_2 14:8 transmit probability 2 - transmit probability accessed when color is ? 01 ? reserved 7 reserved prob_3 6:0 transmit probability 3 - transmit probability accessed when color is ? 00 ?
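the addressing just described can be summarized in a short sketch. it assumes a hypothetical cab_read32() helper; the entry address formed as qqqt and the color-to-probability mapping (prob_3 at bits 6:0 for color '00' through prob_0 at bits 30:24 for color '11') come from the description and table above.

    /*
     * minimal sketch of i_tx_prob_mem addressing: a 3-bit qosclass plus the
     * 1-bit remote egress status for the current priority select one of 16
     * entries, and the 2-bit color selects one of the four packed 7-bit
     * probabilities within that entry.
     */
    #include <stdint.h>

    #define I_TX_PROB_MEM_BASE  0x30000000u

    extern uint32_t cab_read32(uint32_t addr);   /* assumed */

    static uint8_t ingress_tx_probability(uint8_t qos_class,   /* 3 bits */
                                          uint8_t res_status,  /* 1 bit  */
                                          uint8_t color)       /* 2 bits */
    {
        uint32_t entry_index = ((qos_class & 0x7u) << 1) | (res_status & 0x1u); /* qqqt */
        uint32_t entry = cab_read32(I_TX_PROB_MEM_BASE | (entry_index << 4));

        /* color '00' -> bits 6:0, '01' -> 14:8, '10' -> 22:16, '11' -> 30:24 */
        unsigned shift = (unsigned)(color & 0x3u) * 8u;
        return (uint8_t)((entry >> shift) & 0x7Fu);
    }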
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 473 of 554 13.14.1.2 ingress pseudo-random number register (i_rand_num) this register contains a 32-bit pseudo-random number used in the flow control algorithms. the cab accesses this register in order to modify its starting point in the pseudo-random sequence. however, a write to this register is not necessary to start the pseudo-random sequence: it starts generating pseudo-random numbers as soon as the reset is finished. access type read/write base addresses x ? 3000 0100 ? rand_num 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description rand_num 31:0 32-bit pseudo-random number
13.14.1.3 free queue thresholds register (fq_th)

this register contains three thresholds that are compared against the ingress free queue. the results of this comparison are used in the flow control algorithms. thresholds are in units of 16 buffers.

access type: read/write
base address: x'a040 0020'
register layout (bits 31:0): reserved (31:24), fq_sbfq_th (23:16), fq_p0_th (15:8), fq_p1_th (7:0)

field name  bit(s)  reset  description
reserved    31:24          reserved
fq_sbfq_th  23:16   x'ff'  threshold for comparison to the ingress free queue count (bcb_fq). when the number of buffers indicated by bcb_fq is less than the number of buffers indicated by fq_sbfq_th, the i_freeq_th i/o is set to 1.
fq_p0_th    15:8    x'ff'  threshold for comparison to the ingress free queue count (bcb_fq) when determining flow control actions against priority 0 traffic. when the number of buffers indicated by bcb_fq is less than the number of buffers indicated by fq_p0_th, the flow control hardware discards the frame.
fq_p1_th    7:0     x'ff'  threshold for comparison to the ingress free queue count (bcb_fq) when determining flow control actions against priority 1 traffic. when the number of buffers indicated by bcb_fq is less than the number of buffers indicated by fq_p1_th, the flow control hardware discards the frame.
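a small sketch of these comparisons follows, assuming that the free queue count and the three thresholds are all expressed in the same 16-buffer units. the structure and function names are illustrative only.

    /* sketch of the fq_th threshold comparisons */
    #include <stdbool.h>
    #include <stdint.h>

    struct ingress_fq_status {
        bool i_freeq_th;        /* mirrors the i_freeq_th i/o */
        bool discard_p0;        /* flow control discards priority 0 frames */
        bool discard_p1;        /* flow control discards priority 1 frames */
    };

    static struct ingress_fq_status
    evaluate_fq_thresholds(uint32_t bcb_fq_count,   /* free buffers, in units of 16 */
                           uint8_t fq_sbfq_th, uint8_t fq_p0_th, uint8_t fq_p1_th)
    {
        struct ingress_fq_status s;
        s.i_freeq_th = bcb_fq_count < fq_sbfq_th;
        s.discard_p0 = bcb_fq_count < fq_p0_th;
        s.discard_p1 = bcb_fq_count < fq_p1_th;
        return s;
    }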
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 475 of 554 13.14.2 egress flow control structures 13.14.2.1 egress transmit probability memory (e_tx_prob_mem) register the egress flow control hardware contains an internal memory that holds 64 different transmit probabilities for flow control. the probability memory occupies 16 entries in the cab address space. each probability entry in the cab is 32 bits wide and contains four 7-bit probabilities. an entry in the egress probability memory is accessed by using the 4-bit fc_info field taken from the egress fcbpage as an index. the egress flow control hardware uses a 2-bit address to access the probabilities within each probability memory entry. this address (formed as fp) comprises two components: a 1-bit ? threshold exceeded ? value for the current priority of the flow queue count (see section 4.7.1 on page 355), and a 1-bit ? threshold exceeded ? value for the current priority of the combined flow/port queue count (see section 4.7.2 on page 363). access type read/write base addresses x ? b000 00#0 ? note: the base address is listed with a ? # ? replacing one of the hex digits. the ? # ? ranges from x ? 0 ? to x ? f ? , and indicates which probability entry is being referenced. reserved prob_0 reserved prob_1 reserved prob_2 reserved prob_3 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31 reserved prob_0 30:24 transmit probability 0 - transmit probability accessed when ? fp ? is ? 11 ? reserved 23 reserved prob_1 22:16 transmit probability 1 - transmit probability accessed when ? fp ? is ? 10 ? reserved 15 reserved prob_2 14:8 transmit probability 2 - transmit probability accessed when ? fp ? is ? 01 ? reserved 7 reserved prob_3 6:0 transmit probability 3 - transmit probability accessed when ? fp ? is ? 00 ?
ibm powernp np4gs3 network processor preliminary configuration page 476 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.14.2.2 egress pseudo-random number (e_rand_num) this register contains a 32-bit pseudo-random number used in the flow control algorithms. the cab accesses this register in order to modify its starting point in the pseudo-random sequence. however, a write to this register is not necessary to start the pseudo-random sequence: it starts generating pseudo-random numbers as soon as the reset is finished. access type read/write base addresses x ? b000 0100 ? rand_num 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description rand_num 31:0 32-bit pseudo-random number
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 477 of 554 13.14.2.3 p0 twin count threshold (p0_twin_th) this register contains the threshold value that is compared against the priority 0 twin count. the results of this comparison are used in the flow control algorithms. 13.14.2.4 p1 twin count threshold (p1_twin_th) this register contains the threshold value that is compared against the priority 1 twin count. the results of this comparison are used in the flow control algorithms. access type read/write base addresses x ? a040 0100 ? reserved p0_twin_th 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:19 reserved p0_twin_th 18:0 x ? 0 0000 ? p0 twin count threshold value used in the egress flow control hard- ware algorithm. access type read/write base addresses x ? a040 0200 ? reserved p1_twin_th 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:19 reserved p1_twin_th 18:0 x ? 0 0000 ? p1 twin count threshold value used in the egress flow control hard- ware algorithm.
ibm powernp np4gs3 network processor preliminary configuration page 478 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.14.2.5 egress p0 twin count ewma threshold register (e_p0_twin_ewma_th) this register contains the threshold value that is compared against the egress p0 twin count ewma (see section 4.11.2 on page 380). the results of this comparison are placed on the remote egress status bus. 13.14.2.6 egress p1 twin count ewma threshold register (e_p1_twin_ewma_th) this register contains the threshold value that is compared against the egress p1 twin count ewma (see section 4.11.3 on page 381). the results of this comparison are placed on the remote egress status bus. access type read/write base addresses x ? a040 0400 ? reserved p0_twin_ewma_th 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:19 reserved p0_twin_ewma_th 18:0 x ? 0 0000 ? priority 0 egress leased twin count exponentially weighted moving average threshold value. this value is compared against the egress p0 twin count ewma and its result placed on the remote egress status bus. access type read/write base addresses x ? a040 0800 ? reserved p1_twin_ewma_th 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:19 reserved p1_twin_ewma_th 18:0 x ? 0 0000 ? egress priority 1 twin count exponentially weighted moving average threshold value. this value is compared against the egress p1 twin count ewma and its result placed on the remote egress status bus.
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 479 of 554 13.14.3 exponentially weighted moving average constant (k) register (ewma_k) this register contains constant (k) values for the various exponentially weighted moving averages calcu- lated in the ingress and egress flow control hardware. the k value is encoded as follows: k encoding constant value 00 1 01 1/2 10 1/4 11 1/8 access type read/write base addresses x ? a040 0040 ? reserved e_fq_ewma_k e_twin_ewma_k i_fq_ewma_k 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:6 reserved e_fq_ewma_k 5:4 00 k value for the egress free queue count exponentially weighted moving average calculation in the egress flow control hardware. e_twin_ewma_k 3:2 00 k value for the egress p0/p1 twin count ewma calculation in the egress flow control hardware. i_fq_ewma_k 1:0 00 k value for the ingress free queue count exponentially weighted moving average calculation in the ingress flow control hardware.
13.14.4 exponentially weighted moving average sample period (t) register (ewma_t)

this register contains the sample periods for the various exponentially weighted moving averages calculated in the ingress and egress flow control hardware. the values in this register are the number of 10 µs multiples for the interval between calculations of the respective ewmas. the computation of an ewma does not occur unless the respective field in this register is non-zero.

access type: read/write
base address: x'a040 0080'
register layout (bits 31:0): reserved (31:30), e_fq_ewma_t (29:20), e_twin_ewma_t (19:10), i_fq_ewma_t (9:0)

field name     bit(s)  reset   description
reserved       31:30           reserved
e_fq_ewma_t    29:20   x'000'  sample period for the egress free queue count exponentially weighted moving average calculation in the egress flow control hardware.
e_twin_ewma_t  19:10   x'000'  sample period for the egress p0/p1 twin count ewma calculation in the egress flow control hardware.
i_fq_ewma_t    9:0     x'000'  sample period for the ingress free queue count exponentially weighted moving average calculation in the ingress flow control hardware.
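the sketch below shows the arithmetic implied by the ewma_k encoding (1, 1/2, 1/4, 1/8) and the ewma_t sample period (10 µs multiples). it uses the conventional update form avg = k*sample + (1-k)*avg; the hardware's exact internal arithmetic is not described in this document, so this is only an interpretation of the encodings.

    /* sketch of the ewma arithmetic implied by the ewma_k and ewma_t fields */
    #include <stdint.h>

    static uint32_t ewma_update(uint32_t avg, uint32_t sample, uint8_t k_encoding)
    {
        unsigned shift = k_encoding & 0x3u;            /* k = 1 / (1 << shift): 00->1, 01->1/2, 10->1/4, 11->1/8 */
        /* avg = k*sample + (1-k)*avg, computed with integer shifts */
        return (sample >> shift) + (avg - (avg >> shift));
    }

    static uint32_t ewma_period_ns(uint16_t t_field)   /* 10-bit field from ewma_t */
    {
        return (uint32_t)t_field * 10000u;             /* each unit is 10 us */
    }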
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 481 of 554 13.14.5 remote egress status bus configuration enables (res_data_cnf) this register controls operation of the remote egress status bus. the remote egress status bus is a 2-bit bus that allows communication between the system made up of all of the egress flow control hardware components and the system made up of all of the ingress flow control hardware. one bit of this bus is the sync pulse, and the other bit is tdm data reflecting the status of the egress leased twin counts as they relate to their respective thresholds. base address offset 0 base address offset 0 access type read/write base addresses x ? a000 0880 ? reserved e_res_data_en e_res_sync_en i_res_data_en 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:3 reserved e_res_data_en 2 1 egress remote egress status bus data enable. 0 sets the res_data i/o to a hi-z state. 1 places data about this np4gs3's egress data store congestion state. e_res_sync_en 1 0 egress remote egress status bus sync enable. when this field is set to 1, the np4gs3 transmits a sync pulse every remote egress status bus interval. only one network processor in a sys- tem will have this bit enabled. i_res_data_en 0 1 ingress remote egress status bus data enable. 0 disables use of the remote egress status bus for ingress flow con- trol. the ingress flow control hardware treats the remote egress status bus as if it contained all 0s. 1 enables use of the remote egress status bus for ingress flow con- trol. values are captured for each remote target blade and used for ingress flow control.
ibm powernp np4gs3 network processor preliminary configuration page 482 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.15 target port data storage map (tp_ds_map) register the target port data storage map indicates in which data store, ds_0 or ds_1 or both, that data is found for a port. each port is configured with two bits; when set to 1, it indicates the data is found in the corresponding data store. base address offset 0 base address offset 1 base address offset 2 access type read/write base address x ? a000 0140 ? dmu_d dmu_c port 9 port 8 port 7 port 6 port 5 port 4 port 3 port 2 port 1 port 0 port 9 port 8 port 7 port 6 port 5 port 4 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 313029282726252423222120191817161514131211109876543210 dmu_c dmu_b dmu_a port 3 port 2 port 1 port0 port 9 port 8 port 7 port 6 port 5 port 4 port 3 port 2 port 1 port 0 port 9 port 8 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 313029282726252423222120191817161514131211109876543210 dmu_a reserved port 7 port 6 port 5 port 4 port 3 port 2 port 1 port 0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 ds_1 ds_0 313029282726252423222120191817161514131211109876543210
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 483 of 554 base address offset 0 base address offset 1 field name bit(s) reset description dmu_d port 9 31:30 01 the relationship between individual bits and the datastore is shown in detail in the diagram above. dmu_d port 8 29:28 01 the relationship between individual bits and the datastore is shown in detail in figure 13.15: target port data storage map (tp_ds_map) reg- ister on page 482. dmu_d port 7 27:26 01 dmu_d port 6 25:24 01 dmu_d port 5 23:22 01 dmu_d port 4 21:20 01 dmu_d port 3 19:18 01 dmu_d port 2 17:16 01 dmu_d port 1 15:14 01 dmu_d port 0 13:12 01 dmu_c port 9 11:10 01 dmu_c port 8 9:8 01 dmu_c port 7 7:6 01 dmu_c port 6 5:4 01 dmu_c port 5 3:2 01 dmu_c port 4 1:0 01 field name bit(s) reset description dmu_c port 3 31:30 01 the relationship between individual bits and the datastore is shown in detail in figure 13.15: target port data storage map (tp_ds_map) reg- ister on page 482. dmu_c port 2 29:28 01 dmu_c port 1 27:26 01 dmu_c port 0 25:24 01 dmu_b port 9 23:22 01 dmu_b port 8 21:20 01 dmu_b port 7 19:18 01 dmu_b port 6 17:16 01 dmu_b port 5 15:14 01 dmu_b port 4 13:12 01 dmu_b port 3 11:10 01 dmu_b port 2 9:8 01 dmu_b port 1 7:6 01 dmu_b port 0 5:4 01 dmu_a port 9 3:2 01 dmu_a port 8 1:0 01
ibm powernp np4gs3 network processor preliminary configuration page 484 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 base address offset 2 field name bit(s) reset description dmu_a port 7 31:30 01 the relationship between individual bits and the datastore is shown in detail in figure 13.15: target port data storage map (tp_ds_map) reg- ister on page 482. dmu_a port 6 29:28 01 dmu_a port 5 27:26 01 dmu_a port 4 25:24 01 dmu_a port 3 23:22 01 dmu_a port 2 21:20 01 dmu_a port 1 19:18 01 dmu_a port 0 17:16 01 reserved 15:0 reserved
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 485 of 554 13.16 egress sdm stack threshold register (e_sdm_stack_th) access type read/write base address x ? a000 1800 ? threshold reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description threshold 31:28 x ? 8 ? e-sdm stack threshold value. when this threshold is violated (threshold value is less than the count of empty entries in the e-sdm stack), send grant is set to its disable state. reserved 27:0 reserved
ibm powernp np4gs3 network processor preliminary configuration page 486 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.17 free queue extended stack maximum size (fq_es_max) register this register sets the number of buffers that are released into the free queue and thus made available for the storage of received frames. the egress eds reads this register when building the fq_es. access type read/write base address x ? a000 2100 ? fq_es_max reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description fq_es_max 31:24 x ? 08 ? maximum size of the free queue extended stack measured in incre- ments of 2 k buffer twins. the egress eds reads this value when building the fq_es. the maximum size is limited by the ddr dram used by the egress data store. each 128-bit page holds six entries each. once this register is written, the hardware creates entries in the buffer free queue (fq) at a rate of 6 entries every 150 or 165 ns (rate is dependent on the setting of bit 6 of the dram parameter register - 11/10 ). the value in this register may be modified during operation. however, the new value may not be smaller than the current value. reserved 23:0 reserved
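the following sketch converts a desired number of buffer twins into the fq_es_max field (increments of 2 k twins in bits 31:24). cab_write32() is an assumed helper, and the value actually usable is further limited by the ddr dram populated for the egress data store.

    /* minimal sketch for programming fq_es_max */
    #include <stdint.h>

    #define FQ_ES_MAX  0xA0002100u

    extern void cab_write32(uint32_t addr, uint32_t value);   /* assumed */

    static void set_fq_es_max(uint32_t twins)
    {
        uint32_t units = twins / 2048u;       /* field is in increments of 2 k buffer twins */
        if (units > 0xFFu)
            units = 0xFFu;
        cab_write32(FQ_ES_MAX, units << 24);  /* bits 23:0 are reserved */
    }

recall from the register description that, once operation has started, a new value written here may only be larger than the current value, never smaller.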
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 487 of 554 13.18 egress free queue thresholds a queue count is maintained by the free queue extended stack management hardware. the count value is continuously compared to the value contained in each of three threshold registers. the result of this compar- ison affects the np4gs3 ? s flow control mechanisms. the register values must be chosen such that fq_es_th_0  fq_es_th_1  fq_es_th_2. 13.18.1 fq_es_threshold_0 register (fq_es_th_0) access type read/write base address x ? a000 2010 ? fq_es_thresh_0 reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description fq_es_thresh_0 31:17 x ? 0000 ? free queue extended stack threshold 0 as measured in units of 16 twins. when this threshold is violated (the threshold value is greater than the number of remaining twins in the free queue):  frame data received at the switch interface is discarded (the number of frames discarded is counted).  frames that have started re-assembly that receive data while this threshold is violated are also discarded (all data associated with the frame is discarded).  guided traffic data is not discarded.  an interrupt is sent to the epc. reserved 16:0 reserved
ibm powernp np4gs3 network processor preliminary configuration page 488 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.18.2 fq_es_threshold_1 register (fq_es_th_1) 13.18.3 fq_es_threshold_2 register (fq_es_th_2) access type read/write base address x ? a000 2020 ? fq_es_thresh_1 reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description fq_es_thresh_1 31:17 x ? 0000 ? free queue extended stack threshold 1 as measured in units of 16 twins. when this threshold is violated (the threshold value is greater than the number of remaining twins in the free queue), an interrupt is sent to the epc. reserved 16:0 reserved access type read/write base address x ? a000 2040 ? fq_es_thresh_2 reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description fq_es_thresh_2 31:17 x ? 0000 ? free queue extended stack threshold 2 as measured in units of 16 twins. when this threshold is violated (the threshold value is greater than the number of remaining twins in the free queue), an interrupt is sent to the epc and, if enabled by dmu configuration, the ethernet preamble is reduced to 6 bytes reserved 16:0 reserved
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 489 of 554 13.19 discard flow qcb register (discard_qcb) this register is used by the egress hardware when the scheduler and flow control are enabled. this register contains the address of the flow qcb to be used when egress flow control actions require that the frame be discarded. this register and the qcb referenced by the address must be configured by hardware. see the ibm powernp np4gs3 databook, section 6, for details on configuring the qcb. when the scheduler is disabled, the register contains the target port queue to which the flow control discarded frames are sent. this value should be set to x ? 029 ? . access type read/write base address x ? a000 1400 ? reserved discard_qid 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:11 reserved discard_qid 10:0 x ? 029 ? the discard qid field contains the address of the qcb that has been con- figured for discarding egress frames while the scheduler is enabled. when the scheduler is disabled, this is the target port id (x ? 029 ? ) to which dis- carded frames due to flow control discard actions are sent.
ibm powernp np4gs3 network processor preliminary configuration page 490 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.20 bandwidth allocation register (bw_alloc_reg, np4gs3b (r2.0)) the bandwidth allocation register is used to assure bandwidth to the egress datastore for the epc datastore coprocessor and the dispatch unit. for the epc datastore coprocessor, this register provides assured band- width when writing to the egress data store. for the epc dispatch unit, it provides assured bandwidth when reading the egress data store. access type read/write base addresses x ? a000 2800 ? reserved ds_bw_alloc disp_bw_alloc 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:16 reserved ds_bw_alloc 15:8 0 data store coprocessor write bw allocation control value. an 8-bit count- down value. a value of 0 disables this function. this value is loaded into a control register and is decremented each data store access window. while the control register is non-zero, the switch interface has high priority for datastore write access. the epc data store coprocessor write requests must wait until there is a free window. once the control register has counted down to 0, the epc request for the data store write access window has high priority. the control register remains at zero until an epc data store coprocessor write request is serviced. whenever an epc data store write window request is serviced and the configuration register is non-zero, the control register is re-loaded with the configuration value. disp_bw_alloc 7:0 0 dispatch unit read bw allocation control value. an 8-bit countdown value. a value of 0 disables this function. the value is loaded into a control regis- ters and is decremented each data store access window. while the con- trol register is non-zero, the following priority is observed for read access to the egress data stores: 8/9th of the time 1/9th of the time 1. pmm - dmu a-d read requests pmm - dmu a-d read requests 2. discard port dispatch unit dispatch unit epc datastore coprocessor read 3. epc datastore coprocessor read discard port 4. pmm - wrap dmu pmm - wrap dmu once the control register decrements to zero, the following priority is observed: 8/9th of the time 1/9th of the time 1. dispatch unit dispatch unit 2. pmm - dmu a-d read requests pmm - dmu a-d read requests discard port epc datastore coprocessor read 3. epc datastore coprocessor read discard port 4. pmm - wrap dmu pmm - wrap dmu when the dispatch unit is serviced, or a dram refresh occurs for ds0/ ds1, the control register is re-loaded with the configuration value.
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 491 of 554 13.21 frame control block fq size register (fcb_fq_max) this register sets the number of frame control blocks that are released to the fcb fq during initialization. this register must be set prior to initialization (see 13.7.1 initialization register (init) on page 451) to affect the fcb fq size. any changes after initialization will not affect the number of available fcbs. access type read/write base address x ? a000 2200 ? fcb_fq _max reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description fcb_fq_max 31:30 00 indicates the number of fcbs released into the fcb free queue during initialization. 00 128k 01 256k 10 512k 11 reserved reserved 29:0 reserved note: the following settings are recommended based on the size of the d4 dram which is configured in the dram_parm register (see 13.1.2 dram parameter register (dram_parm) on page 440) bits 2:1. dram_size fcb_fq_max 00 01 01 10 10 11 11 reserved
ibm powernp np4gs3 network processor preliminary configuration page 492 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.22 data mover unit (dmu) configuration there are four data mover units (dmu) configured for internal mac operation, for external connection detec- tion (i.e., attached control point detection), and for external bus operation (i.e., tmii, smii, tbi, and pos framer). base address offset 0 base address offset 1 base address offset 2 base address offset 3 access type base address offset 0 read only base address offset 1 read/write base address offset 2 read/write base address offset 3 read/write base address dmu_a x ? a001 0010 ? dmu_b x ? a001 0020 ? dmu_c x ? a001 0040 ? dmu_d x ? a001 0080 ? reserved in_reset cp_detect 313029282726252423222120191817161514131211109876543210 reserved framer_ac_strip ac_strip_ena framer_ac_insert ac_insert_ena bus_delay crc_type_32 vlan_chk_dis etype_chk_dis pause_chk_dis ignore_crc enet_catchup_ena tx_thresh dm_bus_mode tx_ena(9:0) 313029282726252423222120191817161514131211109876543210 reserved rx_ena (9:0) fdx/hdx (9:0) jumbo (9:0) 313029282726252423222120191817161514131211109876543210 reserved honor_pause (9:0) 313029282726252423222120191817161514131211109876543210
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 493 of 554 base address offset 0 base address offset 1 field name bit(s) reset description reserved 31:2 reserved in_reset 1 1 dmu in reset indicator originates in the clocking logic. 0 written when the clock logic removes the reset signal for the dmu. 1 dmu is held in reset mode. cp_detect 0 0 control point detected indicator value originates in the pmm. 0 control point processor connection not present 1 dmu detected a control point processor connection. field name bit(s) reset description reserved 31:28 reserved framer_ac_strip 27 0 configures the mac operation for an attached pos framer. 0 framer passes ac field to the network processor 1 framer does not pass ac field to the network processor. the crc checking performed is modified to account for the missing field. an ac value of x'ff03' is assumed. ac_strip_ena 26 0 configures the mac operation for an attached pos framer. 0 ac field is not stripped from the packet. 1 ac field is stripped from the packet prior to being stored in the ingress data store. framer_ac_insert 25 0 configures the mac operation for an attached pos framer. 0 ac field is assumed present in the packet sent to the framer. 1 ac field is not present in the packet sent to the framer. the framer inserts this field and crc generation is adjusted to account for the missing ac field. an ac value of x'ff03' is assumed. ac_insert_ena 24 1 configures the mac operation for an attached pos framer. 0 ac field is not inserted by the mac. for proper operation, framer_ac_insert must be set to 1. 1 ac field is inserted by the mac. for proper operation, framer_ac_insert must be set to 0. bus_delay 23 0 bus delay controls the length of the delay between a poll request being made and when the mac samples the framer ? s response. 0 sample is taken 1 cycle after the poll request 1 sample is taken 2 cycles after the poll request. crc_type_32 22 0 crc_type_32 controls the type of frame crc checking and generation performed in the pos mac. this field does not control crc generation for the ethernet mac. the ethernet mac can generate only 32-bit crc. 0 16-bit crc in use. 1 32-bit crc in use vlan_chk_dis 21 0 vlan checking disable control value. 0 enable vlan checking. 1 disable vlan checking by the dmu. etype_chk_dis 20 0 ethernet type checking disable control value. 0 enable dmu checking. 1 disable dmu checking of e_type_c and e_type_d
ibm powernp np4gs3 network processor preliminary configuration page 494 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 base address offset 2 pause_chk_dis 19 0 pause frame checking disable control value. 0 pause frames are processed by the mac. they are not sent to the ingress eds. 1 pause frames are not processed by the mac. the frames are sent to the ingress eds for service. see also honor_pause in offset 3 for additional control of the mac in rela- tion to pause frames. ignore_crc 18 0 ignore crc controls the behavior of crc checking for each dmu. 0 discard frames with bad crc 1 ignore bad crc enet_catchup_ena 17 1 ethernet mac catch up enabled. when enabled and fq_es_threshold_2 is violated, the mac uses a 6-byte preamble instead of a 7-byte preamble. 0 disabled 1 enabled tx_thresh 16:14 100 transmit threshold configures the number of cell buffers that must be filled before the transmission of a frame can start. 000 invalid 001-100 range available for all dmu_bus_mode settings 101-111 range available only for gigabit ethernet, pos oc-12, and pos oc-48 dmu_bus_mode settings dm_bus_mode 13:10 1010 data mover bus mode configures the mode in which the dm bus oper- ates. 0000 reserved 0001 10/100 ethernet smii mode 0010 gigabit ethernet gmii mode 0011 gigabit ethernet tbi mode 0100 pos oc-12 mode; non-polling pos support 0101 pos 4xoc-3 mode; polling pos support 0110-1001 reserved 1010 cp detect mode 1011 debug mode (dmu d only) 1100-1101 reserved 1110 dmu disabled 1111 pos oc-48 mode; quad dmu mode, non-polling, pos support. when configuring for this mode of operation, all 4 dmu configuration registers must be set up the same (all bits in all offsets must be identical). note: when configuring the dmu, the dmu_bus_mode field must be set first. subsequent cab writes can used to configure the remaining fields. tx_ena(9:0) 9:0 0 port transmit enable control is a bitwise enable for each port ? s transmit- ter. 0 disable port 1 enable port field name bit(s) reset description reserved 31:30 reserved rx_ena(9:0) 29:20 0 port receive enable control is a bitwise enable for each port ? s receiver. 0 disable port 1 enable port field name bit(s) reset description
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 495 of 554 base address offset 3 fdx/hdx(9:0) 19:10 0 full duplex or half duplex operation mode for ports 9 to 0 controls the mode of operation for the associated port. 0 half duplex (hdx) operation 1 full duplex (fdx) operation jumbo(9:0) 9:0 0 jumbo frame operation mode for ports 9 to 0 controls the mode of opera- tion for the associated port. 0 jumbo frames disabled 1 jumbo frames enabled field name bit(s) reset description reserved 31:10 reserved honor_pause(9:0) 9:0 0 honor pause control value is a bitwise control value for the port ? s pause function. 0 ignore pause frames received by corresponding port 1 pause when pause frame received by corresponding port field name bit(s) reset description
ibm powernp np4gs3 network processor preliminary configuration page 496 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.23 qd accuracy register (qd_acc) the qd accuracy register tunes the egress scheduler ? s wfq rings. the values assure fairness and some benefit to queues with lower defined qd values which expect better service when enqueueing to an empty queue. the value is also a scaling factor when servicing queues. configuration recommendations are depen- dent on the maximum frame sizes expected for a dmu. max frame size qd_acc_dmu 2k 6 9k 8 14 k 10 there is one field defined per media dmu (a-d). access type read/write base address x ? a002 4000 ? qd_acc_ dmu_d qd_acc_ dmu_c qd_acc_ dmu_b qd_acc_ dmu_a reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description qd_acc_dmu_d 31:28 0 qd accuracy value used for dmu_d qd_acc_dmu_c 27:24 0 qd accuracy value used for dmu_c qd_acc_dmu_b 23:20 0 qd accuracy value used for dmu_b qd_acc_dmu_a 19:16 0 qd accuracy value used for dmu_a reserved 15:0 reserved
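a sketch of applying the recommendation table above follows. it assumes a cab_write32() helper and treats the 2 k / 9 k / 14 k breakpoints as 2048 / 9216 / 14336 bytes (an assumption; the document gives only the rounded figures). per-dmu fields occupy bits 31:28 (d), 27:24 (c), 23:20 (b), and 19:16 (a).

    /* sketch of programming qd_acc from the max-frame-size recommendations */
    #include <stdint.h>

    #define QD_ACC  0xA0024000u

    extern void cab_write32(uint32_t addr, uint32_t value);   /* assumed */

    static uint8_t qd_acc_for_max_frame(uint32_t max_frame_bytes)
    {
        if (max_frame_bytes <= 2048u)  return 6;
        if (max_frame_bytes <= 9216u)  return 8;
        return 10;                     /* up to the 14 k maximum */
    }

    static void set_qd_acc(uint32_t max_fs_a, uint32_t max_fs_b,
                           uint32_t max_fs_c, uint32_t max_fs_d)
    {
        uint32_t v = ((uint32_t)qd_acc_for_max_frame(max_fs_d) << 28)
                   | ((uint32_t)qd_acc_for_max_frame(max_fs_c) << 24)
                   | ((uint32_t)qd_acc_for_max_frame(max_fs_b) << 20)
                   | ((uint32_t)qd_acc_for_max_frame(max_fs_a) << 16);
        cab_write32(QD_ACC, v);        /* bits 15:0 are reserved */
    }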
13.24 packet over sonet control register (pos_ctrl)

one configuration register per dmu is provided to control pos framer interaction. it configures transmit and receive burst sizes and sets the value used for ac field insertion.

access type: read/write
base addresses:
  dmu_a  x'a004 0100'
  dmu_b  x'a004 0200'
  dmu_c  x'a004 0400'
  dmu_d  x'a004 0800'
register layout (bits 31:0): tx_burst_size (31:24), rx_burst_size (23:16), a/c insert value (15:0)

field name        bit(s)  reset    description
tx_burst_size     31:24   x'10'    transmit burst size. used only for oc-3 modes of operation; oc-12 and oc-48 interfaces transmit until the framer de-asserts txpfa. when set to 0, the mac uses txpfa from the framer to stop transmission of data. when set to a value other than 0, the mac bursts data to the framer up to the burst size or until the framer drops txpfa. it is recommended that the low water mark in the framer be set to a value equal to or greater than the value of tx_burst_size.
rx_burst_size     23:16   x'10'    receive burst size.
a/c insert value  15:0    x'ff03'  value used by the mac when ac_insert_ena is set to 1.
ibm powernp np4gs3 network processor preliminary configuration page 498 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.25 packet over sonet maximum frame size (pos_max_fs) this register controls the maximum frame size supported by the network processor. pos permits frames up to 64 k bytes, however, the network processor is constrained to 14 k (14336) bytes maximum. this register allows setting for smaller frame sizes. frames received by the network processor that exceed the length specified by this register are aborted during reception. access type read/write base address x ? a004 0080 ? reserved pos_max_fs 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description reserved 31:14 reserved pos_max_fs 13:0 x ? 3800 ? packet over sonet maximum frame size sets the maximum frame size that the network processor can receive on a pos port. the value in this register is used to determine the length of a long frame for the ingress and egress long frame counters.
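the clamping described above can be expressed as a short sketch. cab_write32() is an assumed helper; the 14336-byte cap and the 14-bit field position come from the register description.

    /* minimal sketch for programming pos_max_fs */
    #include <stdint.h>

    #define POS_MAX_FS        0xA0040080u
    #define NP_POS_MAX_BYTES  14336u           /* x'3800', the reset value and device limit */

    extern void cab_write32(uint32_t addr, uint32_t value);   /* assumed */

    static void set_pos_max_frame_size(uint32_t bytes)
    {
        if (bytes > NP_POS_MAX_BYTES)
            bytes = NP_POS_MAX_BYTES;          /* pos allows up to 64 k, the device does not */
        cab_write32(POS_MAX_FS, bytes & 0x3FFFu);   /* bits 31:14 are reserved */
    }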
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 499 of 554 13.26 ethernet encapsulation type register for control (e_type_c) this configuration register is used by the pmm to recognize ethernet-encapsulated guided frames. when the ethernet frame ? s type field matches this value, the mac da, sa, and type fields of the frame are stripped and the frame is queued onto the gfq. access type read/write base addresses x ? a001 1000 ? e_type reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description e_type 31:16 x ? 0000 ? ethernet type used for encapsulated guided traffic. reserved 15:0 reserved
ibm powernp np4gs3 network processor preliminary configuration page 500 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.27 ethernet encapsulation type register for data (e_type_d) this configuration register is used by the pmm to recognize ethernet-encapsulated data frames. when the ethernet frame ? s type field matches this value, the mac da, sa, and type fields of the frame are stripped and the frame is queued onto the gdq. access type read/write base addresses x ? a001 2000 ? e_type reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description e_type 31:16 x ? 0000 ? ethernet type used for encapsulated cp data traffic. reserved 15:0 reserved
ibm powernp np4gs3 preliminary network processor np3_dl_sec13_config.fm.08 may 18, 2001 configuration page 501 of 554 13.28 source address array (sa_array) the sa array is a register array containing 64 source address values. the sa pointer from the egress fcb references elements of this register array. the value retrieved from the sa array is used to insert or overlay the sa field of transmitted frames during egress frame alteration. base address offset 0 base address offset 1 base address offset 0 base address offset 1 access type read/write base addresses note: each dmu listed below contains 64 entries. nibbles 5 and 6 of the base address is incremented by x ? 01 ? for each successive entry, represented by ? ## ? ,and ranging from x ? 00 ? to x ? 3f ? . the word offset is represented by ? w ? , and ranges from x ? 0 ? to x ? 1 ? . dmu_a x ? 8810 0##w ? dmu_b x ? 8811 0##w ? dmu_c x ? 8812 0##w ? dmu_d x ? 8813 0##w ? wrap x ? 8814 0##w ? sa (47:16) 313029282726252423222120191817161514131211109876543210 sa (15:0) reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description sa(47:16) 31:0 not defined source address value. the dmu accesses an sa value when it performs egress frame alteration functions. the data is the source address that is either overlaid or inserted into a frame. the entry address is inferred from the contents of the fcb. field name bit(s) reset description sa(15:0) 31:16 not defined source address value. the dmu accesses an sa value when it performs egress frame alteration functions. the data is the source address that is either overlaid or inserted into a frame. the entry address is inferred from the contents of the fcb. reserved 15:0 reserved
ibm powernp np4gs3 network processor preliminary configuration page 502 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.29 dasl initialization and configuration 13.29.1 dasl configuration register (dasl_config) this register contains control information for the dasl-a and dasl-b interfaces. access type read/write base address x ? a000 0110 ? external_wrap_mode gs_mode gs_throttle switchover_init alt_ena pri_sync_term alt_sync_term sdm_priority sdm_use_primary sdm_use_alternate probe_priority probe_use_primary probe_use_alternate reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description external_wrap_mode 31 0 external wrap mode indicator value. 0 the network processor is configured for dasl connection to one or two switches supporting up to 64 target blades. 1 the network processor is configured for external dasl wrap con- nections. one or two network processors can be supported in this mode, restricting target blade addressing to the values of 0 or 1. gs_mode 30:29 01 grant status mode controls the setting of and response to grant status by a functional unit. 00 reserved. 01 deassert grant status when e-sdm is full. 10 deassert grant status one out of every n+1 times, where the value n is contained in the gs throttle field. 11 reserved. gs_throttle 28:22 x ? 00 ? grant status throttle when gs mode is set to '10', gs throttle controls the rate at which grant status is deasserted. that rate is once every ? gs_throttle + 1 ? cell times. switchover_init 21 0 switchover initialization 0nop 1 switchover initialization re-starts the primary dasl interface (nor- mal use is in response to a switchover event). alt_ena 20 0 alternate dasl enable control flag. 0nop 1 set by swi to enable the alternate dasl port. then the swi starts sending sync cells and the receiver searches the attain synchronization with the incoming serial stream. pri_sync_term 19 0 primary dasl synchronization termination 0nop 1 stops the np4gs3 ? s primary dasl interface from sending the synchronization pattern that completes the dasl synchroniza- tion sequence on the primary dasl interface.
field name           bit(s)  reset  description
alt_sync_term        18      0      alternate dasl synchronization termination
                                    0  nop
                                    1  stops the np4gs3's alternate dasl interface from sending the synchronization pattern that completes the dasl synchronization sequence on the alternate dasl interface.
sdm_priority         17      1      i-sdm priority configures the arbitration priority of i-sdm accesses to the dasl interface.
                                    0  high priority
                                    1  low priority
sdm_use_primary      16      1      sdm use primary
                                    0  nop
                                    1  configures the i-sdm traffic to use the primary dasl interface.
sdm_use_alternate    15      0      sdm use alternate
                                    0  nop
                                    1  configures the i-sdm traffic to use the alternate dasl interface.
probe_priority       14      0      probe_priority configures the arbitration priority of the probe accesses to the dasl interface.
                                    0  high priority
                                    1  low priority
probe_use_primary    13      0      probe use primary
                                    0  nop
                                    1  configures the probe traffic to use the primary dasl interface.
probe_use_alternate  12      1      probe use alternate
                                    0  nop
                                    1  configures the probe traffic to use the alternate dasl interface.
reserved             11:0           reserved
13.29.1.1 dynamic switch interface selection

the dasl configuration register uses designations such as primary and alternate. these designations refer to the use of one of the dasl interfaces, a or b. the correspondence of primary or alternate to a dasl interface changes based on the value of the switch_bna signal i/o, as shown in the following table:

switch_bna   primary dasl   alternate dasl
0            dasl a         dasl b
1            dasl b         dasl a

selection of the dasl interface occurs on a frame-by-frame basis for frames from either the internal i-sdm interface or the internal probe interface (np4gs3 datasheet, section 5). three configuration bits and two calculated bits are used for this determination. the three configuration bits are:
 - xxx_use_primary: configuration bits 16 and 13 (i-sdm and probe) of the dasl configuration register
 - xxx_use_alternate: configuration bits 15 and 12 (i-sdm and probe) of the dasl configuration register
 - alt_enabled: configuration bit 20 of the dasl configuration register
where xxx designates either the sdm or the probe.

the calculated bits, local tb and remote tb, are the result of comparing the local tb vector (all frames in 16 blade mode and unicast frames in 64 blade mode) and the local mc target blade vector (multicast frames in 64 blade mode) with the tb field of the fcb for the frame being transmitted toward the switch interface.

the local tb bit is set to 1 when, in 16 blade mode, the bit location of a 1 in the tb field matches a 1 in the same bit location of the local tb vector; when, in 64 blade mode, the bit position corresponding to the encoded value in the tb field is a 1 in the local tb vector; or when, for 64 blade multicast, the encoded value is smaller than the value in the local mc target blade vector.

the remote tb bit is set to 1 when, in 16 blade mode, the bit location of a 1 in the tb field matches a 0 in the same bit location of the local tb vector; when, in 64 blade mode, the bit position corresponding to the encoded value in the tb field is a 0 in the local tb vector; or when, for 64 blade multicast, the encoded value is larger than or equal to the value in the local mc target blade vector.

note that local tb and remote tb are mutually exclusive in 64 blade mode. in 16 blade mode, however, both local and remote can be 1 or 0 at the same time.
the following table defines the actions taken for all combinations of these inputs.

alt_enabled   use primary   use alternate   local tb   remote tb   destination selected
0             -             -               -          -           primary
1             1             1               -          -           both
1             -             -               1          1           both
1             -             0               0          -           primary
1             -             0               1          0           alternate
1             0             1               0          -           alternate
1             0             1               1          0           primary
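the table above can be restated as a c function, as sketched below. the enum and argument names are illustrative; the inputs are the three configuration bits and the two calculated bits described in section 13.29.1.1, and the row ordering of the checks mirrors the table.

    /* sketch of the frame-by-frame dasl destination selection */
    #include <stdbool.h>

    enum dasl_destination { DASL_PRIMARY, DASL_ALTERNATE, DASL_BOTH };

    static enum dasl_destination
    select_dasl(bool alt_enabled, bool use_primary, bool use_alternate,
                bool local_tb, bool remote_tb)
    {
        if (!alt_enabled)
            return DASL_PRIMARY;                  /* alternate port not enabled */
        if (use_primary && use_alternate)
            return DASL_BOTH;
        if (local_tb && remote_tb)
            return DASL_BOTH;                     /* 16 blade mode case */
        if (!use_alternate)                       /* traffic configured for the primary interface */
            return local_tb ? DASL_ALTERNATE : DASL_PRIMARY;
        /* use_alternate set, use_primary clear */
        return local_tb ? DASL_PRIMARY : DASL_ALTERNATE;
    }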
ibm powernp np4gs3 network processor preliminary configuration page 506 of 554 np3_dl_sec13_config.fm.08 may 18, 2001 13.29.2 dasl bypass and wrap register (dasl_bypass_wrap) this register controls the internal wrap path which bypasses the dasl interface. access type read/write base address x ? a002 0080 ? bypass_ wrap_ena b a reserved 313029282726252423222120191817161514131211109876543210 field name bit(s) reset description bypass_wrap_ena 31:30 00 dasl bypass and wrap control enables or disables the dasl bypass and internal wrap path. bit 31 controls the wrap for the dasl-b interface. bit 30 controls the wrap for the dasl-a interface. 0 wrap disabled 1 wrap enabled reserved 29:0 reserved
13.29.3 dasl start register (dasl_start)

this configuration register initializes the dasl interfaces when the dasl_init field makes a transition from '0' to '1'. the completion of dasl initialization is reported in the init_done register.

access type: read/write
base address: x'a000 0210'
bit layout: bit 31 dasl_init; bits 30:0 reserved

field name   bit(s)  reset  description
dasl_init    31      0      dasl initialization control value. switching this value from '0' to '1' causes the
                            dasl interfaces to initialize.
reserved     30:0           reserved
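for illustration, the required 0-to-1 transition on dasl_init could be produced as in the sketch below. as before, the direct 32-bit accesses are only a stand-in for the platform's real register access path, and polling of the init_done register is indicated only in a comment because that register is defined elsewhere in the datasheet.

#include <stdint.h>

#define DASL_START_ADDR  0xA0000210u    /* base address x'a000 0210' */
#define DASL_INIT        (1u << 31)     /* bit 31: dasl_init         */

/* drive dasl_init through a '0' to '1' transition to start
 * initialization of the dasl interfaces. */
void dasl_start_init(void)
{
    volatile uint32_t *dasl_start =
        (volatile uint32_t *)(uintptr_t)DASL_START_ADDR;
    uint32_t v = *dasl_start;

    *dasl_start = v & ~DASL_INIT;   /* make sure the field is '0' first    */
    *dasl_start = v | DASL_INIT;    /* the 0 -> 1 transition starts init   */

    /* completion is reported in the init_done register (defined elsewhere
     * in the datasheet), which software would poll at this point. */
}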
13.30 programmable i/o register (pio_reg) (np4gs3b (r2.0))

this configuration register provides the control for the pio signal pins (see np4gs3 datasheet section 2.1.9, miscellaneous pins).

access type: read/write
base address: x'a040 4000'
bit layout: pio_state (bits 8:6), pio_enable (bits 5:3), pio_write (bits 2:0); remaining bits reserved

field name   bit(s)  reset  description
reserved     30:9           reserved
pio_state    8:6            current value on the pio(2:0) signal pins. pio signal pin mapping to register bits:
                            bit 8 = pio(2), bit 7 = pio(1), bit 6 = pio(0).
pio_enable   5:3            controls the pio(2:0) driver state: 0 = driver is in tristate, 1 = driver is enabled.
                            pio signal pin mapping to register bits: bit 5 = pio(2), bit 4 = pio(1), bit 3 = pio(0).
pio_write    2:0            value to be driven onto the pio(2:0) signal pins when the corresponding driver is
                            enabled. pio signal pin mapping to register bits: bit 2 = pio(2), bit 1 = pio(1),
                            bit 0 = pio(0).
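a small c sketch of the field usage follows, driving and sampling the pio pins with the bit positions listed above. as with the earlier sketches, the direct 32-bit access at the base address and the function names are illustrative assumptions, not part of any ibm-supplied interface.

#include <stdbool.h>
#include <stdint.h>

#define PIO_REG_ADDR      0xA0404000u  /* base address x'a040 4000'            */
#define PIO_STATE_SHIFT   6            /* bits 8:6  current value on pio(2:0)  */
#define PIO_ENABLE_SHIFT  3            /* bits 5:3  1 = driver enabled         */
#define PIO_WRITE_SHIFT   0            /* bits 2:0  value driven when enabled  */

/* drive pio(n), n = 0..2, to the requested level */
void pio_drive(unsigned n, bool level)
{
    volatile uint32_t *pio_reg = (volatile uint32_t *)(uintptr_t)PIO_REG_ADDR;
    uint32_t v = *pio_reg;

    if (level)
        v |=  (1u << (PIO_WRITE_SHIFT + n));
    else
        v &= ~(1u << (PIO_WRITE_SHIFT + n));
    v |= 1u << (PIO_ENABLE_SHIFT + n);      /* take the driver out of tristate */
    *pio_reg = v;
}

/* sample the level currently present on pio(n) */
bool pio_sample(unsigned n)
{
    volatile uint32_t *pio_reg = (volatile uint32_t *)(uintptr_t)PIO_REG_ADDR;
    return (*pio_reg >> (PIO_STATE_SHIFT + n)) & 1u;
}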
14. electrical and thermal specifications

the np4gs3 utilizes ibm cmos sa-27e technology.

table 14-1. absolute maximum ratings

symbol   parameter                         rating        units   notes
vdd      1.8 v power supply voltage        1.95          v       1
ta       operating temperature (ambient)   -40 to +100   °c      1
tj       junction temperature              -40 to +125   °c      1
tstg     storage temperature               -65 to +150   °c      1

1. stresses greater than those listed under "absolute maximum ratings" may cause permanent damage to the device. this is a stress rating only, and functional operation of the device at these or any other conditions above those indicated in the operational sections of this specification is not implied. exposure to absolute maximum rating conditions for extended periods may affect reliability.

table 14-2. input capacitance (pf) (page 1 of 21) (ta = 25 °c, f = 1 mhz, vdd = 3.3 v ± 0.3 v)

grid position   signal name   total incap
a02 sch_addr(11) 11.31 a03 spare_tst_rcvr(2) 9.37 a04 sch_addr(02) 11.01 a05 switch_clk_b 8.67 a06 sch_data(08) 10.81 a07 sch_addr(00) 10.21 a08 sch_data(15) 10.31 a09 sch_data(07) 10.01 a10 sch_data(00) 9.91 a11 d4_addr(01) 9.81 a12 dd_ba(0) 10.01 a13 d4_data(31) 8.8 a14 d4_data(24) 8.5 a15 d4_data(19) 8.7 a16 d4_data(11) 8.5 a17 d4_data(04) 8.7 a18 ds1_data(26) 8.5 a19 ds1_data(19) 8.8 a20 ds1_data(14) 8.7 a21 ds1_data(06) 9 a22 ds1_data(00) 9.1 a23 ds0_addr(10) 10.21
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 510 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 a24 ds0_dqs(2) 9 a25 ds0_data(28) 9.2 a26 ds0_data(20) 9.5 a27 ds0_data(14) 9.5 a28 ds0_addr(00) 11.41 a29 ds0_data(09) 10 a30 ds0_data(13) 10.4 a31 ds0_cs 11.81 a32 ds0_data(01) 10.8 a33 ds0_data(03) 11.7 aa03 dasl_in_a(7) 6.5 aa04 dasl_in_a(6) 5.9 aa05 lu_addr(17) 7.31 aa06 lu_data(23) 7.71 aa07 lu_data(28) 7.51 aa08 lu_data(29) 7.01 aa09 lu_addr(00) 6.81 aa10 lu_addr(02) 6.31 aa11 lu_addr(18) 5.81 aa12 d0_data(00) 4.8 aa14 d0_addr(04) 6.11 aa15 d1_data(06) 5 aa16 d3_data(01) 5.1 aa17 d3_addr(06) 6.21 aa18 d2_data(03) 5 aa19 d2_addr(12) 6.21 aa20 da_ba(1) 6.1 aa22 jtag_tck 5.85 aa23 jtag_tdo 6.15 aa25 dmu_a(28) 7.05 aa26 dmu_a(27) 7.25 aa27 dmu_a(26) 7.15 aa28 dmu_a(25) 7.55 aa29 dmu_a(24) 7.85 aa30 dmu_a(23) 8.45 aa31 dmu_a(22) 8.95 table 14-2. input capacitance (pf) (page 2 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 511 of 554 aa32 dmu_a(21) 9.55 aa33 dmu_a(20) 9.95 ab03 dasl_in_a(6) 6.4 ab05 lu_addr(14) 7.81 ab07 lu_addr(03) 7.51 ab09 lu_addr(16) 6.91 ab11 d0_data(03) 4.7 ab13 d0_data(28) 5 ab15 d0_addr(05) 6.11 ab17 d1_addr(00) 6.21 ab19 d2_dqs(0) 5.1 ab21 d6_data(04) 5 ab23 c405_debug_halt 6.15 ab25 pci_ad(02) 8.55 ab27 pci_ad(01) 8.55 ab29 pci_ad(00) 9.75 ab31 dmu_a(30) 8.85 ab33 dmu_a(29) 9.95 ac01 lu_addr(07) 9.41 ac02 dasl_in_a(5) 7 ac03 dasl_in_a(5) 6.6 ac04 lu_addr(11) 8.41 ac08 lu_r_wrt 7.51 ac09 lu_addr(04) 7.01 ac11 d0_data(13) 5 ac12 d0_dqs(1) 5 ac14 d0_addr(10) 6.31 ac15 d0_data(29) 5.1 ac16 d1_data(15) 5.1 ac17 d3_dqs(1) 5.1 ac18 d3_addr(07) 6.41 ac20 d2_addr(05) 6.51 ac21 d6_data(10) 5.1 ac22 d6_data(15) 5.1 ac23 pci_bus_m_int 7.45 ac24 pci_ad(11) 8.15 table 14-2. input capacitance (pf) (page 3 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 512 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 ac25 pci_ad(10) 8.55 ac26 pci_ad(09) 8.45 ac27 pci_ad(08) 8.65 ac28 pci_cbe(0) 9.25 ac29 pci_ad(07) 9.65 ac30 pci_ad(06) 10.35 ac31 pci_ad(05) 10.45 ac32 pci_ad(04) 10.85 ac33 pci_ad(03) 11.35 ad01 dasl_in_a(4) 7.6 ad03 dasl_in_a(4) 6.8 ad07 lu_addr(12) 7.71 ad09 d0_data(01) 5.9 ad11 d0_data(23) 5.2 ad13 db_ba(1) 6.5 ad15 d1_cs 6.61 ad19 d3_we 6.81 ad21 d2_addr(06) 6.81 ad25 pci_cbe(1) 8.55 ad27 pci_ad(15) 9.45 ad29 pci_ad(14) 9.95 ad31 pci_ad(13) 10.65 ad33 pci_ad(12) 11.45 ae01 dasl_in_a(3) 7.7 ae02 dasl_in_a(1) 7.2 ae03 dasl_in_a(3) 6.9 ae06 lu_addr(05) 8.61 ae07 lu_addr(06) 7.61 ae08 lu_addr(15) 7.71 ae09 d0_data(11) 5.9 ae10 d0_data(22) 5.9 ae11 d0_dqs(2) 5.8 ae12 d1_data(00) 5.7 ae13 db_ras 6.7 ae14 d1_addr(07) 6.91 ae15 d3_data(03) 5.8 table 14-2. input capacitance (pf) (page 4 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 513 of 554 ae16 d3_data(09) 5.7 ae17 d2_data(08) 6.3 ae18 d3_addr(05) 7.11 ae19 d3_addr(12) 7.21 ae20 d2_data(07) 6 ae21 d2_data(09) 6 ae22 d6_data(14) 6 ae23 d6_data(09) 6.1 ae24 d6_addr(02) 7.3 ae25 mgrant_b(1) 7.15 ae26 pci_frame 8.75 ae27 pci_irdy 9.65 ae28 pci_trdy 10.25 ae29 pci_devsel 10.05 ae30 pci_stop 10.55 ae31 pci_perr 10.75 ae32 pci_serr 11.05 ae33 pci_par 11.55 af01 dasl_in_a(1) 8 af07 lu_addr(13) 7.91 af09 d0_data(20) 6.5 af11 d0_addr(12) 7.51 af13 d1_addr(06) 7.21 af15 d3_data(14) 6.1 af17 d3_addr(02) 7.01 af19 d3_data(15) 6.2 af21 d6_dqs_par(01) 6.3 af23 d6_byteen(1) 7.7 af25 d6_addr(04) 7.5 af27 pci_ad(17) 9.55 af29 pci_ad(16) 10.35 af31 pci_cbe(2) 11.15 af33 pci_clk 11.85 ag03 d0_data(09) 8 ag05 lu_addr(09) 10.01 ag06 lu_addr(10) 8.81 table 14-2. input capacitance (pf) (page 5 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 514 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 ag07 d0_data(02) 6.7 ag08 d0_data(21) 6.6 ag09 d0_dqs(0) 6.5 ag10 d0_addr(11) 7.71 ag11 db_ba(0) 6.4 ag12 d1_addr(05) 7.61 ag13 d3_data(00) 6.5 ag14 d3_data(02) 6.5 ag15 d3_data(13) 6.4 ag16 d3_data(10) 6.4 ag17 d3_data(12) 6.2 ag18 d3_addr(04) 7.71 ag19 d3_addr(00) 7.71 ag20 d3_dqs(0) 6 ag21 d6_addr(11) 7.1 ag22 d2_data(10) 6.1 ag23 d2_addr(07) 7.61 ag24 d6_byteen(0) 8.2 ag25 da_ras 8.2 ag26 d6_addr(03) 8.3 ag27 mgrant_b(0) 8.25 ag28 pci_ad(23) 10.45 ag29 pci_ad(22) 11.65 ag30 pci_ad(21) 10.95 ag31 pci_ad(20) 11.05 ag32 pci_ad(19) 11.75 ag33 pci_ad(18) 11.85 ah01 dasl_in_a(2) 8.7 ah03 dasl_in_a(2) 7.9 ah05 de_ba(1) 8.61 ah07 d0_data(10) 7.7 ah09 d0_dqs(3) 6.7 ah11 d1_data(11) 6.3 ah13 d1_data(14) 6.2 ah15 d1_addr(11) 7.41 ah17 d3_data(11) 6.1 table 14-2. input capacitance (pf) (page 6 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 515 of 554 ah19 db_clk 7.4 ah21 d2_addr(01) 7.71 ah23 d2_addr(04) 8.01 ah25 d6_data(08) 7.3 ah27 d6_addr(12) 9.6 ah29 pci_ad(24) 10.25 ah31 pci_cbe(3) 11.65 ah33 pci_idsel 12.45 aj02 dasl_in_a(0) 8.5 aj03 lu_clk 9.51 aj04 de_clk 9 aj05 de_clk 8.7 aj06 d0_data(14) 8.6 aj07 d0_data(16) 7.4 aj08 d0_data(31) 7.2 aj09 d0_addr(02) 8.31 aj10 d1_data(03) 6.9 aj11 d1_data(07) 6.4 aj12 db_cas 7.8 aj13 d1_addr(01) 7.21 aj14 d1_dqs(0) 6.7 aj15 d1_addr(12) 7.91 aj16 d3_addr(01) 7.61 aj17 d3_addr(03) 7.81 aj18 db_clk 7.4 aj19 d2_data(01) 6.8 aj20 d2_data(04) 6.9 aj21 d2_data(14) 7.1 aj22 d2_addr(00) 8.41 aj23 d2_addr(11) 8.61 aj24 d2_we 8.61 aj25 d6_we 8.7 aj26 d6_data(12) 7.9 aj27 d6_addr(07) 9.1 aj28 d6_addr(09) 10.4 aj29 pci_ad(29) 10.65 table 14-2. input capacitance (pf) (page 7 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 516 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 aj30 pci_ad(28) 10.95 aj31 pci_ad(27) 11.35 aj32 pci_ad(26) 12.15 aj33 pci_ad(25) 12.25 ak01 dasl_in_a(0) 9.3 ak03 de_ras 9.91 ak05 d0_data(15) 8 ak07 d0_data(30) 8.1 ak09 d1_data(04) 7.7 ak11 d1_data(08) 7.2 ak13 d1_addr(02) 8.41 ak15 d3_data(07) 7.3 ak17 d3_addr(11) 8.11 ak19 d3_addr(08) 8.71 ak21 d2_data(13) 7.5 ak23 d2_addr(10) 8.81 ak25 d2_dqs(1) 8.3 ak27 d6_data(13) 8.7 ak29 d6_addr(08) 9.8 ak31 pci_ad(31) 11.65 ak33 pci_ad(30) 12.95 al01 spare_tst_rcvr(4) 8.25 al02 de_cas 10.61 al03 d0_data(12) 9.2 al04 d0_data(08) 8.8 al05 switch_clk_a 7.87 al06 d0_data(26) 8.5 al07 d0_data(19) 8.2 al08 d0_addr(07) 9.11 al09 d0_data(25) 7.7 al10 d0_addr(00) 9.01 al11 d0_addr(09) 6.61 al12 d1_data(02) 7.7 al13 d1_data(09) 6.6 al14 d1_addr(09) 8.81 al15 d3_data(06) 5.1 table 14-2. input capacitance (pf) (page 8 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 517 of 554 al16 d1_we 8.91 al17 d3_data(04) 7.5 al18 d3_cs 8.91 al19 d3_addr(09) 9.11 al20 d2_data(05) 7.7 al21 d2_addr(09) 8.91 al22 d2_cs 9.21 al23 d6_data(00) 7.9 al24 d6_data(07) 8.3 al25 da_clk 9.2 al26 d6_data(02) 8.4 al27 d6_addr(05) 9.8 al28 d6_addr(00) 10.2 al29 d6_addr(10) 10.3 al30 d6_dqs(0) 9.5 al31 d6_cs 11 al32 pci_grant 12.35 al33 pci_request 13.05 am01 de_ba(0) 11.31 am03 d0_data(05) 9.5 am05 d0_data(27) 9.4 am07 d0_addr(06) 9.91 am09 d0_we 9.71 am11 d0_addr(08) 9.51 am13 d1_data(12) 8 am15 d1_addr(03) 9.21 am17 d3_data(05) 8.2 am19 d2_data(12) 8 am21 d2_addr(03) 9.41 am23 d6_data(01) 8.6 am25 da_clk 9.9 am27 d6_data(03) 9.3 am29 d6_parity(00) 10 am31 d6_dqs(3) 10.2 am33 pci_inta 13.05 an01 d0_data(06) 9.3 table 14-2. input capacitance (pf) (page 9 of 21) (t a =25  c,f=1mhz,v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 518 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 an02 d0_data(04) 10.1 an03 d0_data(07) 9.9 an04 d0_data(17) 9.8 an05 switch_clk_a 7.77 an06 d0_addr(03) 10.81 an07 d0_data(18) 9 an08 d0_data(24) 9.1 an09 d0_cs 9.41 an10 d0_addr(01) 9.91 an11 d1_data(01) 7.4 an12 d1_data(10) 8.8 an13 d1_data(13) 6.4 an14 d1_addr(04) 9.71 an15 d1_addr(08) 9.91 an16 d1_dqs(1) 8.5 an17 d3_addr(10) 9.91 an18 d2_data(00) 8.5 an19 d2_data(06) 8.8 an20 d2_data(11) 8.7 an21 d2_addr(02) 10.21 an22 d2_addr(08) 10.31 an23 d6_parity(01) 9 an24 d6_data(06) 9 an25 d6_data(11) 9.2 an26 d6_addr(01) 10.5 an27 da_cas 10.5 an28 d6_data(05) 10.2 an29 da_ba(0) 11 an30 d6_addr(06) 11.4 an31 d6_dqs(1) 10.6 an32 d6_dqs_par(00) 10.8 an33 d6_dqs(2) 11.7 b01 sch_addr(16) 11.31 b03 sch_addr(13) 10.71 b05 sch_addr(01) 10.61 b07 d4_addr(12) 9.91 table 14-2. input capacitance (pf) (page10of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 519 of 554 b09 sch_data(06) 9.71 b11 d4_addr(07) 9.51 b13 dd_clk 9 b15 d4_data(25) 8 b17 ds1_dqs(0) 8.2 b19 ds1_data(13) 8 b21 ds1_data(05) 8.2 b23 ds0_addr(05) 9.81 b25 ds0_data(29) 8.9 b27 ds0_addr(03) 10.51 b29 ds0_data(23) 10 b31 ds0_data(02) 10.2 b33 clock125 11.65 c01 sch_addr(17) 11.31 c02 sch_addr(14) 10.61 c03 sch_addr(09) 10.41 c04 sch_addr(12) 10.01 c05 switch_clk_b 7.87 c06 sch_data(12) 9.71 c07 sch_clk 9.41 c08 d4_addr(09) 9.11 c09 sch_data(14) 8.91 c10 sch_data(01) 9.01 c11 d4_addr(06) 8.81 c12 d4_addr(00) 8.91 c13 dd_clk 8.4 c14 d4_data(17) 7.6 c15 d4_data(03) 7.7 c16 d4_data(10) 7.7 c17 ds1_we 8.61 c18 ds1_data(27) 7.7 c19 ds1_dqs(1) 7.9 c20 ds1_data(21) 7.7 c21 dc_ras 8.7 c22 ds0_addr(11) 9.21 c23 ds0_addr(06) 9.11 table 14-2. input capacitance (pf) (page11of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 520 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 c24 ds0_dqs(1) 8.3 c25 ds0_data(21) 8.2 c26 ds0_addr(04) 9.61 c27 ds0_data(15) 8.8 c28 ds0_data(22) 9.2 c29 ds0_data(08) 9.3 c30 ds0_data(04) 9.5 c31 ds0_data(05) 10 c32 operational 10.95 c33 core_clock 11.65 d01 dasl_in_b(0) 9.3 d03 sch_addr(15) 9.91 d05 sch_addr(10) 9.21 d07 sch_data(13) 9.31 d09 d4_addr(08) 8.91 d11 d4_dqs(0) 7.2 d13 d4_data(26) 7.2 d15 d4_data(02) 7.3 d17 d4_data(05) 7 d19 ds1_dqs(2) 7.5 d21 ds1_data(12) 7.5 d23 dc_clk 8.6 d25 dc_ba(0) 9.3 d27 ds0_data(26) 8.7 d29 ds0_data(11) 8.8 d31 dmu_d(01) 10.25 d33 dmu_d(00) 11.55 e02 dasl_in_b(0) 8.5 e03 spare_tst_rcvr(1) 7.81 e04 sch_addr(05) 9.21 e05 sch_addr(18) 8.91 e06 sch_addr(03) 9.81 e07 sch_addr(04) 8.61 e08 sch_data(09) 8.41 e09 d4_data(18) 7.1 e10 d4_cs 8.11 table 14-2. input capacitance (pf) (page12of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 521 of 554 e11 d4_dqs(1) 6.9 e12 d4_data(29) 6.8 e13 d4_data(27) 6.8 e14 d4_data(16) 6.7 e15 d4_data(12) 6.7 e16 ds1_cs 7.61 e17 ds1_addr(03) 7.81 e18 ds1_data(20) 6.4 e19 ds1_data(25) 6.8 e20 ds1_data(22) 6.9 e21 ds1_data(11) 7.1 e22 ds1_data(08) 7.2 e23 dc_clk 8.4 e24 ds0_addr(12) 8.61 e25 ds0_dqs(3) 7.7 e26 ds0_data(27) 7.9 e27 ds0_data(12) 8.1 e28 ds0_data(10) 9.4 e29 blade_reset 9.25 e30 dmu_d(04) 9.55 e31 dmu_d(29) 9.95 e32 dmu_d(12) 10.75 f01 dasl_in_b(2) 8.7 f03 dasl_in_b(2) 7.9 f05 sch_addr(07) 8.61 f07 sch_addr(08) 8.91 f09 sch_data(02) 7.91 f11 d4_we 7.51 f13 d4_data(30) 6.2 f15 d4_data(13) 6.2 f17 ds1_addr(10) 7.31 f19 ds1_data(24) 6.4 f21 ds1_data(07) 6.5 f23 ds1_data(04) 6.8 f25 ds0_dqs(0) 7.3 f27 ds0_data(06) 8.6 table 14-2. input capacitance (pf) (page13of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 522 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 f29 dmu_d(07) 8.85 f31 dmu_d(06) 10.25 f33 dmu_d(05) 11.05 g02 dasl_in_b(3) 8 g03 spare_tst_rcvr(5) 7.51 g04 dasl_in_b(3) 7.3 g05 sch_r_wrt 10.01 g06 sch_data(10) 8.81 g07 sch_data(11) 7.91 g08 sch_data(17) 7.81 g09 sch_data(05) 7.71 g10 d4_addr(04) 7.71 g11 dd_cas 7.81 g12 d4_data(22) 6.4 g13 d4_data(08) 6.5 g14 d4_data(07) 6.5 g15 ds1_addr(08) 7.61 g16 ds1_addr(11) 7.61 g17 ds1_addr(09) 7.41 g18 ds1_addr(02) 7.71 g19 ds1_addr(05) 7.71 g20 ds1_data(30) 6 g21 ds1_data(29) 6.1 g22 ds1_data(15) 6.1 g23 ds1_data(01) 6.4 g24 ds0_addr(07) 8.41 g25 ds0_data(30) 7.2 g26 ds0_data(17) 7.3 g27 pci_bus_nm_int 9.65 g28 dmu_d(02) 9.05 g29 dmu_d(11) 10.25 g30 dmu_d(10) 9.55 g31 dmu_d(30) 9.65 g32 dmu_d(08) 10.35 h01 dasl_in_b(1) 8 h07 sch_addr(06) 7.91 table 14-2. input capacitance (pf) (page14of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 523 of 554 h09 d4_data(06) 6.5 h11 d4_addr(03) 7.51 h13 d4_data(23) 6 h15 ds1_addr(07) 7.31 h17 ds1_addr(04) 7.11 h19 ds1_addr(06) 7.41 h21 ds0_data(07) 6.3 h23 ds0_addr(08) 7.91 h25 ds0_data(16) 6.5 h27 dmu_d(16) 8.15 h29 dmu_d(15) 8.95 h31 dmu_d(14) 9.75 h33 dmu_d(13) 10.45 j01 dasl_in_b(4) 7.7 j02 dasl_in_b(1) 7.2 j03 dasl_in_b(4) 6.9 j06 mg_data 7.88 j07 mg_clk 7.28 j08 mg_nintr 6.98 j10 sch_data(16) 7.11 j11 sch_data(03) 7.01 j12 d4_addr(02) 6.91 j13 dd_ba(1) 6.91 j14 d4_data(21) 5.7 j15 ds1_data(17) 5.8 j16 ds1_addr(12) 6.91 j17 dc_ba(1) 6.9 j18 ds1_addr(01) 7.11 j19 ds1_data(31) 6 j20 ds1_data(18) 6 j21 ds1_data(16) 6 j22 ds0_addr(09) 7.21 j23 ds0_we 7.31 j24 ds0_data(18) 6.3 j25 dmu_d(25) 7.15 j26 dmu_d(24) 7.35 table 14-2. input capacitance (pf) (page15of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 524 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 j27 dmu_d(23) 8.35 j28 dmu_d(22) 8.85 j29 dmu_d(03) 8.65 j30 dmu_d(20) 9.15 j31 dmu_d(19) 9.35 j32 dmu_d(18) 9.65 j33 dmu_d(17) 10.15 k01 dasl_in_b(5) 7.6 k03 dasl_in_b(5) 6.8 k07 boot_picocode 6.98 k13 dd_ras 6.71 k15 d4_data(09) 5.4 k19 ds1_data(28) 5.6 k21 ds1_data(02) 5.6 k23 ds0_data(19) 5.5 k25 dmu_d(09) 7.15 k27 dmu_d(21) 8.05 k29 dmu_d(28) 8.55 k31 dmu_d(27) 9.25 k33 dmu_d(26) 10.05 l02 dasl_in_b(6) 7 l03 dasl_in_b(6) 6.6 l04 boot_ppc 7.68 l12 sch_data(04) 6.21 l13 d4_addr(05) 6.31 l15 d4_data(20) 5.1 l16 d4_data(01) 5.1 l17 ds1_data(10) 5.2 l18 ds1_data(09) 5.2 l19 ds0_data(25) 5.2 l20 ds1_data(03) 5.3 l22 ds0_data(31) 5.1 l23 dmu_c(10) 6.05 l24 dmu_c(09) 6.75 l25 dmu_c(08) 7.15 l26 dmu_c(07) 7.05 table 14-2. input capacitance (pf) (page16of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 525 of 554 l27 dmu_c(06) 7.25 l28 dmu_c(05) 7.85 l29 dmu_c(04) 8.25 l30 dmu_c(03) 8.95 l31 dmu_c(02) 9.05 l32 dmu_c(01) 9.45 l33 dmu_c(00) 9.95 m03 dasl_in_b(7) 6.4 m07 pci_speed 6.78 m13 d4_addr(10) 6.21 m15 d4_dqs(3) 4.9 m17 d4_data(00) 4.9 m19 ds0_addr(02) 6.31 m21 ds0_data(24) 5 m23 dmu_c(16) 6.15 m25 dmu_c(15) 7.15 m27 dmu_c(14) 7.15 m29 dmu_c(13) 8.35 m31 dmu_c(12) 8.85 m33 dmu_c(11) 9.95 n04 dasl_in_b(7) 5.9 n06 pio(2) 12.51 n08 pio(1) 12.31 n09 pio(0) 7.11 n14 d4_addr(11) 6.11 n15 d4_dqs(2) 5 n16 d4_data(15) 5.1 n17 ds1_addr(00) 6.31 n18 ds1_data(23) 5 n19 dc_cas 6 n20 ds0_addr(01) 6.31 n23 dmu_c(27) 6.15 n24 dmu_c(26) 6.55 n25 dmu_c(25) 7.05 n26 dmu_c(24) 7.25 n27 dmu_c(23) 7.15 table 14-2. input capacitance (pf) (page17of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 526 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 n28 dmu_c(22) 7.55 n29 dmu_c(21) 7.85 n30 dmu_c(20) 8.45 n31 dmu_c(19) 8.95 n32 dmu_c(18) 9.55 n33 dmu_c(17) 9.95 p15 d4_data(28) 5.2 p19 ds0_data(00) 5.3 p21 mc_grant_b(1) 5.55 p23 dmu_b(02) 6.15 p25 dmu_b(01) 7.05 p27 dmu_b(00) 7.15 p29 dmu_c(30) 8.15 p31 dmu_c(29) 8.75 p33 dmu_c(28) 9.85 r04 lu_addr(08) 8.01 r05 lu_data(33) 7.51 r07 lu_data(04) 7.21 r08 lu_data(05) 6.81 r17 d4_data(14) 5.6 r18 ds1_dqs(3) 5.6 r20 mc_grant_b(0) 5.65 r21 switch_bna 5.55 r22 dmu_b(18) 5.85 r23 dmu_b(13) 6.15 r24 dmu_b(12) 6.55 r25 dmu_b(11) 6.85 r26 dmu_b(10) 7.15 r27 dmu_b(09) 6.95 r28 dmu_b(08) 7.55 r29 dmu_b(07) 8.15 r30 dmu_b(06) 8.55 r31 dmu_b(05) 8.95 r32 dmu_b(04) 9.45 r33 dmu_b(03) 9.95 t01 spare_tst_rcvr(3) 7.52 table 14-2. input capacitance (pf) (page18of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 527 of 554 t03 spare_tst_rcvr(8) 6.61 t07 lu_data(03) 7.21 t09 lu_data(02) 6.61 t15 lu_data(24) 7.21 t19 send_grant_b 5.95 t21 res_data 5.55 t23 dmu_b(19) 6.15 t25 jtag_trst 6.85 t27 dmu_b(17) 6.95 t29 dmu_b(16) 7.65 t31 dmu_b(15) 8.95 t33 dmu_b(14) 9.75 u01 lu_data(30) 9.41 u03 lu_data(35) 8.11 u05 spare_tst_rcvr(0) 5.47 u06 testmode(1) 5.78 u08 lu_data(08) 6.61 u10 lu_data(34) 6.21 u12 lu_data(11) 6.11 u13 lu_data(01) 6.41 u15 lu_data(00) 7.11 u19 mgrant_a(1) 5.95 u21 res_sync 5.55 u22 jtag_tms 5.75 u23 rx_lbyte(0) 6.15 u24 dmu_b(29) 6.55 u25 dmu_b(28) 6.85 u26 dmu_b(27) 6.95 u27 dmu_b(26) 7.25 u28 dmu_b(25) 7.25 u29 dmu_b(24) 7.85 u30 dmu_b(23) 8.15 u31 dmu_b(22) 8.75 u32 dmu_b(21) 9.45 u33 spare_tst_rcvr(9) 8.12 v01 spare_tst_rcvr(6) 7.51 table 14-2. input capacitance (pf) (page19of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 528 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 v03 spare_tst_rcvr(7) 6.58 v05 testmode(0) 5.98 v07 lu_data(09) 7.21 v09 lu_data(10) 6.61 v11 lu_data(14) 5.91 v13 lu_data(18) 6.41 v15 lu_data(12) 7.21 v19 mgrant_a(0) 5.95 v21 i_freeq_th 5.55 v23 rx_lbyte(1) 6.15 v25 dmu_b(20) 6.85 v27 dmu_a(02) 6.95 v29 dmu_a(01) 7.65 v31 dmu_a(00) 8.95 v33 dmu_b(30) 9.75 w01 lu_data(20) 9.41 w04 lu_data(13) 8.01 w05 lu_data(07) 7.51 w06 lu_data(06) 7.71 w07 lu_data(15) 7.21 w08 lu_data(16) 6.81 w09 lu_data(21) 6.51 w10 lu_data(25) 6.31 w11 lu_data(31) 5.81 w12 lu_data(26) 6.01 w13 lu_data(19) 6.41 w14 lu_data(27) 6.81 w16 d1_addr(10) 6.91 w17 d3_data(08) 5.7 w18 d2_data(02) 5.6 w20 send_grant_a 5.65 w21 mc_grant_a(1) 5.55 w22 jtag_tdi 5.85 w23 dmu_a(13) 6.15 w24 dmu_a(12) 6.55 w25 dmu_a(11) 6.85 table 14-2. input capacitance (pf) (page20of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 529 of 554 w26 dmu_a(10) 7.15 w27 dmu_a(09) 6.95 w28 dmu_a(08) 7.55 w29 dmu_a(07) 8.15 w30 dmu_a(06) 8.55 w31 dmu_a(05) 8.95 w32 dmu_a(04) 9.45 w33 dmu_a(03) 9.95 y03 dasl_in_a(7) 6.2 y05 lu_data(32) 7.51 y07 lu_data(17) 7.41 y09 lu_data(22) 6.81 y11 lu_addr(01) 5.81 y15 d1_data(05) 5.2 y19 d2_data(15) 5.3 y21 mc_grant_a(0) 5.55 y23 dmu_a(19) 6.15 y25 dmu_a(18) 6.95 y27 dmu_a(17) 7.15 y29 dmu_a(16) 8.15 y31 dmu_a(15) 8.75 y33 dmu_a(14) 9.85 table 14-2. input capacitance (pf) (page21of21)(t a =25  c, f = 1 mhz, v dd =3.3v  0.3 v) grid position signal name total incap
table 14-3. operating supply voltages

symbol                           parameter                                 min      typ    max      units   notes
vdd25, vdd3, vdd4, vdd5          2.5 v power supply                        2.375    2.5    2.625    v       1
vdd33, vdd2                      3.3 v power supply                        3.135    3.3    3.465    v       1
vdd                              1.8 v power supply                        1.71     1.8    1.89     v       1
plla_vdd, pllb_vdd, pllc_vdd     pll voltage reference                     1.71     1.8    1.89     v       1, 2
vref1(2-0), vref2(8-0)           sstl2 power supply (used for sstl2 i/o)   1.1875   1.25   1.3125   v       1

1. important power sequencing requirements. the following conditions must be met at all times, including power-up and power-down:
   vref* (1.25 v reference) ≤ vdd25 + 0.4 v
   pll*_vdd (2.5 v reference) ≤ vdd33 + 0.4 v
   vdd18 ≤ vdd25 + 0.4 v
   vdd25 ≤ vdd33 + 0.4 v
   vdd33 ≤ vdd25 + 1.9 v
2. see also pll filter circuit on page 80.

table 14-4. thermal characteristics

thermal characteristic                 min   nominal   max   units   notes
estimated power dissipation                  10.67     13    w       np4gs3 r1.1
operating junction temperature (tj)    0               105   °c      1

1. operation up to tj = 125 °c is supported for up to 3600 hours. however, the electromigration (em) limit must not exceed 105 °c for 88 kpoh equivalent. contact your ibm field applications engineer for em equivalents.
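because the note-1 sequencing relations in table 14-3 must hold at every instant, including during power ramps, a board-bring-up self-check can encode them directly. the sketch below is only an illustration under stated assumptions: the rail names and the idea of sampling each rail voltage are not something the datasheet prescribes.

#include <stdbool.h>

/* one snapshot of the sampled rail voltages, in volts; names are illustrative */
struct np_rails {
    double vref;     /* 1.25 v sstl2 reference */
    double pll_vdd;  /* pll supply             */
    double vdd18;    /* 1.8 v supply           */
    double vdd25;    /* 2.5 v supply           */
    double vdd33;    /* 3.3 v supply           */
};

/* returns true when the table 14-3 note-1 relations hold for this snapshot
 * (they must hold at all times, including power-up and power-down). */
bool np_sequencing_ok(const struct np_rails *r)
{
    return r->vref    <= r->vdd25 + 0.4 &&
           r->pll_vdd <= r->vdd33 + 0.4 &&
           r->vdd18   <= r->vdd25 + 0.4 &&
           r->vdd25   <= r->vdd33 + 0.4 &&
           r->vdd33   <= r->vdd25 + 1.9;
}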
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 531 of 554 14.1 driver specifications table 14-5. definition of terms term definition maul maximum allowable up level. the maximum voltage that can be applied without affecting the specified reliability. cell functionality is not implied. maximum allowable applies to overshoot only. mpul maximum positive up level. the most positive voltage that maintains cell functionality. the maximum positive logic level. lpul least positive up level. the least positive voltage that maintains cell functionality. the minimum positive logic level. mpdl most positive down level. the most positive voltage that maintains cell functionality. the maximum negative logic level. lpdl least positive down level. the least positive voltage that maintains cell functionality. the minimum negative logic level. madl minimum allowable down level. the minimum voltage that can be applied without affecting the specified reliability. mini- mum allowable applies to undershoot only. cell functionality is not implied. table 14-6. 1.8 v cmos driver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 cmos v dd 3 +0.45 v dd 3 v dd 3 - 0.45 0.45 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. 3. v dd ranges as specified in table 14-3 (typical = 1.8 v). table 14-7. 1.8 v cmos driver minimum dc currents at rated voltage v dd =1.65v,t=100 c driver type v high (v) i high (ma) v low (v) i low (ma) cmos 50 ohm driver outputs 1.2 8.0/23.0 1 0.45 7.8 1 1. 23 ma is the electromigration limit for 100k power on hours (poh) = 100 c and 100% duty cycle. this limit can be adjusted for dif- ferent temperature, duty cycle, and poh. consult your ibm application engineer for further details. table 14-8. 2.5 v cmos driver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 cmos v dd 3 +0.6 v dd 3 2.0 0.4 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. 3. v dd ranges as specified in table 14-3 (typical = 2.5 v). table 14-9. 2.5 v cmos driver minimum dc currents at rated voltage v dd =2.3v,t=100 c driver type v high (v) i high (ma) v low (v) i low (ma) cmos 50 ohm driver outputs 2.0 5.2/23 1 0.4 6.9 1 1. 23 ma is the electromigration limit for 100k power on hours (poh) = 100 c and 100% duty cycle. this limit can be adjusted for dif- ferent temperature, duty cycle, and poh. consult your ibm application engineer for further details.
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 532 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 table 14-10. 3.3 v-tolerant 2.5 v cmos driver dc voltage specifications (see note 1) function maul (v) 2 mpul (v) 3 lpul (v) mpdl (v) lpdl (v) madl (v) 4 lvttl 3.9 v dd 5 2.0 0.4 0.00 -0.60 1. all levels adhere to the jedec standard jesd12-6, ? interface standard for semi-custom integrated circuits, ? march 1991. 2. maximum allowable applies to overshoot only. output disabled. 3. output active. 4. minimum allowable applies to undershoot only. 5. v dd ranges as specified in table 14-3 (typical = 2.5 v). table 14-11. 3.3 v lvttl driver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 lvttl v dd330 3 +0.3 v dd330 3 2.4 0.4 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. 3. v dd 33 ranges as specified in table 14-3 (typical = 3.3 v). table 14-12. 3.3 v lvttl/5.0 v-tolerant driver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 lvttl v dd330 3 +0.3 v dd330 3 2.4 0.4 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. 3. v dd 33 ranges as specified in table 14-3 (typical = 3.3 v). table 14-13. 3.3 v lvttl driver minimum dc currents at rated voltage (v dd =3.0v,t=100 c) driver type v high (v) i high (ma) v low (v) i low (ma) lvttl 50 ohm driver outputs 2.40 10.3/23 1 0.4 7.1 1 1. 23 ma is the electromigration limit for 100k power on hours (poh) = 100 c and 100% duty cycle. this limit can be adjusted for dif- ferent temperature, duty cycle, and poh. consult your ibm application engineer for further details.
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 533 of 554 14.2 receiver specifications table 14-14. 1.8 v cmos receiver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 cmos v dd 3 +0.45 v dd 3 0.65 v dd 3 0.35 v dd 3 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. 3. v dd ranges as specified in table 14-3 (typical = 1.8 v). table 14-15. 2.5 v cmos receiver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 cmos v dd 3 +0.6 v dd 1.7 0.70 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. 3. v dd ranges as specified in table 14-3 (typical = 2.5 v). table 14-16. 3.3 v lvttl receiver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 lvttl v dd330 3 +0.3 v dd330 3 2.00 0.80 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. 3. v dd 33 ranges as specified in table 14-3 (typical = 3.3 v). table 14-17. 3.3 v lvttl / 5 v tolerant receiver dc voltage specifications function maul (v) 1 mpul (v) lpul (v) mpdl (v) lpdl (v) madl (v) 2 lvttl 5.5 v 5.5 v 2.00 0.80 0.00 -0.60 1. maximum allowable applies to overshoot only. 2. minimum allowable applies to undershoot only. table 14-18. receiver maximum input leakage dc current input specifications function i il (  a) i ih (  a) without pull-up element or pull-down element 0 at v in =lpdl 0atv in =mpul with pull-down element 0 at v in = lpdl 200 at v in =mpul with pull-up element -150 at v in =lpdl 0atv in =mpul 1. see section 3.3 v lvttl / 5 v tolerant bp33 and ip33 receiver input current/voltage curve on page 534 .
figure 14-1. 3.3 v lvttl / 5 v tolerant bp33 and ip33 receiver input current/voltage curve (plot of i pad (µa), 0.00 to -300.00, versus v pad (v), 0.00 to 3.00; curve shows best case process - 0 °c, 3.6 v)
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 535 of 554 14.3 other driver and receiver specifications table 14-19. lvds receiver dc specifications symbol parameter min nom max units comments v dd device supply voltage 1.65 1.8 1.95 v receiver uses only v dd supply. temp temperature range 0 50 100
c rec pwr input buffer power 9.3 mw including on-chip terminator v pad - v padn =0.4v v ih receiver input voltage v dd + 0.20 v receiver esd connected to v dd v il receiver input voltage -0.20 v v ih - v il receiver input voltage range 100 mv @600 mhz v icm receiver common mode range 0 1.25 v dd v ri input impedance 80 100 120  notes: 1. all dc characteristics are based on power supply and temperature ranges as specified above. 2. this receiver using a v dd of 1.8 v nominal is compatible with the 1.5 v specification described in the lvds standard: ieee stan- dard for low-voltage differential signals (lvds) for scalable coherent interface (sci), ieee standard 1596.3,1996. 3. maximum frequency is load and package dependent. 600 mhz (1.2 gbps) is achievable with a minimum of 100 mv input swing over the wide common range as specified. the customer is responsible for determining optimal frequency and switching capabili- ties through thorough simulation and analysis. table 14-20. sstl2 dc specifications symbol parameter min nom max units comments v dd device supply voltage 1.65 1.8 1.95 v v ddq output supply voltage 2.3 2.5 2.7 v v ddq =v dd250 v tt termination voltage 1.11 - 1.19 1.25 1.31 - 1.39 v 0.5*v ddq v ref differential input reference voltage 1.15 1.25 1.35 v 0.5*v ddq v oh (class ii) output high voltage 1.95 v i oh = 15.2 ma @ 1.95 v v ol (class ii) output low voltage 0.55 v i ol = 15.2 ma @ 0.55 v r oh max (class ii) max pull-up impedance 36.2  notes: 1. all sstl2 specifications are consistent with jedec committee re-ballot (jc-16-97-58a), 10/14/97. 2. di/dt and performance are chosen by performance level selection (a and b). a. performance level a is targeted to run at 200 mhz or faster depending on loading conditions. di/dt is comparable to 110 ma/ns 2.5 v/3.3 v lvttl driver. b. performance level b is targeted to run at 250 mhz or faster depending on loading conditions. di/dt is comparable to 150 ma/ns 2.5 v/3.3 v lvttl driver. 3. the differential input reference supply (v ref ) is brought on chip through vsstl2r1 and vsstl2r2 i/o cells. 4. termination voltage (v tt ) is generated off-chip. 5. sstl2 driver is rated at 20 ma @100 c and 50% duty cycle for 100k power on hours (poh).
ibm powernp np4gs3 network processor preliminary electrical and thermal specifications page 536 of 554 np3_dl_sec14_elec.fm.08 may 18, 2001 r ol max (class ii) max pull-down impedance 36.2  v ih input high voltage v ref +0.18 v ddq +0.3 v v il input low voltage -0.3 v ref -0.18 v i oz 3-state leakage current 010  a driver hi-z temp temperature 0 50 100 c table 14-20. sstl2 dc specifications symbol parameter min nom max units comments notes: 1. all sstl2 specifications are consistent with jedec committee re-ballot (jc-16-97-58a), 10/14/97. 2. di/dt and performance are chosen by performance level selection (a and b). a. performance level a is targeted to run at 200 mhz or faster depending on loading conditions. di/dt is comparable to 110 ma/ns 2.5 v/3.3 v lvttl driver. b. performance level b is targeted to run at 250 mhz or faster depending on loading conditions. di/dt is comparable to 150 ma/ns 2.5 v/3.3 v lvttl driver. 3. the differential input reference supply (v ref ) is brought on chip through vsstl2r1 and vsstl2r2 i/o cells. 4. termination voltage (v tt ) is generated off-chip. 5. sstl2 driver is rated at 20 ma @100 c and 50% duty cycle for 100k power on hours (poh).
ibm powernp np4gs3 preliminary network processor np3_dl_sec14_elec.fm.08 may 18, 2001 electrical and thermal specifications page 537 of 554 14.3.1 dasl specifications table 14-21. dasl receiver dc specifications idasl_a symbol parameter min nom max units comments v dd device supply voltage 1.7 1.8 1.9 v receiver uses only v dd supply temp temperature range 0 50 100 c functional at 125 cwithem degradation rec pwr input buffer power 4.5 mw including on-chip pull-downs for pad and padn. v ih receiver input voltage v dd +0.20 v receiver esd connected to v dd v il receiver input voltage -0.20 v zlos starts to activate when pad and padn go below 0.6 v v ih -v il receiver input voltage range 200 mv v icm receiver common mode range 0.60 0.90 v dd - 0.4 v design notes: 1. all dc characteristics are based on power supply and temperature ranges as specified above. table 14-22. dasl driver dc specifications odasl_a symbol parameter min nom max units comments v dd device supply voltage 1.7 1.8 1.9 v driver uses only v dd supply temp temperature range 0 50 100 c functional at 125 cwithem degradation v oh output voltage high 1.25 1.35 1.45 v board termination of 50  to v dd /2 v ol output voltage low 0.39 0.42 0.45 v board termination of 50  to v dd /2 |v od | output differential voltage 0.80 0.93 1.06 v v oh -v ol v os output offset voltage 0.82 0.89 0.95 v (v oh +v ol )/2 r o output impedance, single-ended 42 52  design assumes 3  for package resistance to achieve 50  nominal. i load terminator current 0.80 0.90 10.5 ma board termination drv pwr output buffer power 17.5 mw excluding termination drv pwr (hi-z mode) output buffer power in hi-z mode 2 mw design notes: 1. all dc characteristics are based on power supply and temperature ranges as specified above and using board termination of 100  with approximately + 5% tolerance over supply and temperature. 2. maximum frequency is load and package dependent. operation of 625 mbps is achievable over a reasonable length of pcb or cable. the customer is responsible for determining optimal frequency and switching capabilities through thorough simulation and analysis using appropriate package and transmission line models.
ibm powernp np4gs3 preliminary network processor np3_dl_sec15_glos.fm.08 may 18, 2001 glossary of terms and abbreviations page 539 of 554 15. glossary of terms and abbreviations term definition alu arithmetic and logic unit api application programming interface arb arbitration ardl advance rope with optional delete leaf arp address resolution protocol ath actual threshold bcb buffer control block bci byte count information beb binary exponential back-off bfq buffer free queue bgp border gateway protocol bird an intermediate leaf. it occurs when the pscbline contains an intermediate lcba pointing to this leaf and an npa pointing to the next pscb. bl burst length blade any i/o card that can be inserted in a modular chassis (also called line card). bsm-ccga bottom surface metallurgy - ceramic column grid array byte 8 bits cab control access bus cba control block address chap challenge-handshake authentication protocol cia common instruction address clns connectionless-mode network service clp see core language processor. core language processor (clp) the picoprocessor core, also referred to as coprocessor 0. the clp executes the base instruction set and controls thread swapping and instruction fetching. cp control point cpf control point function cpix common programming interface
ibm powernp np4gs3 network processor preliminary glossary of terms and abbreviations page 540 of 554 np3_dl_sec15_glos.fm.08 may 18, 2001 crc see cyclic redundancy check. cs control store csa control store arbiter cyclic redundancy check (crc) a system of error checking performed at both the sending and receiving station after a block-check character has been accumulated. da destination address dasl data-aligned synchronous link data store in the network processor, the place where a frame is stored while waiting for processing or forwarding to its destination. dbgs debug services ddr double data rate diffserv differentiated services distinguishing position (distpos) the index value of a first mismatch bit between the input key and the reference key found in a leaf pattern distpos see distinguishing position (distpos). dlq discard list queue dmu data mover unit doubleword 2 words dppu dyadic protocol processor unit dq discard queue dram dynamic random-access memory ds see data store. dscp diffserv code point dsi distributed software interface dt direct table dvmrp distance vector multicast routing protocol e- egress ecc error correcting code ecmp equal-cost multipath term definition
ibm powernp np4gs3 preliminary network processor np3_dl_sec15_glos.fm.08 may 18, 2001 glossary of terms and abbreviations page 541 of 554 eds enqueuer / dequeuer / scheduler e-ds egress data store e-gdq discard queue stack. holds frames that need to be discarded. this is used by the hardware to discard frames when the egress ds is congested or to re-walk a frame marked for discard for a half duplex port. used by the picocode to discard frames that do not have header twins allocated. egress eds egress enqueuer / dequeuer / scheduler enet mac ethernet mac eof end-of-frame epc embedded processor complex e-pmm egress-physical mac multiplexer even parity data checking method where a parity bit is added to a data field. even parity is achieved when the number of bits (data + parity) that contain a ? 1 ? is even. ewma exponentially weighted moving average exponentially weighted average see exponentially weighted moving average. exponentially weighted moving average a method of smoothing a sequence of instantaneous measurements. typically, the average a(t) at time t is combined with the measurement m(t) at time t to yield the next average value: a(t + dt) = w * m(t) + (1 - w) * a(t) here weight w is a number with 0 < w
1. if the weight is 1, then the average is just the previous value of the measurement and no smoothing occurs. else previous values of m contribute to the current value of a with more recent m values being more influential. fcb frame control block ffa flexible frame alternation fhe frame header extension fhf frame header format fm full match fpga field-programmable gate array fta first twin-buffer address gdh see general data handler . term definition
ibm powernp np4gs3 network processor preliminary glossary of terms and abbreviations page 542 of 554 np3_dl_sec15_glos.fm.08 may 18, 2001 general data handler (gdh) a type of thread used to forward frames in the epc. there are 28 gdh threads. general powerpc handler request (gph-req) a type of thread in the epc that processes frames bound to the embedded powerpc. work for this thread is usually the result of a reenqueue action to the ppc queue (it processes data frames when there are no entries to process in the ppc queue). general powerpc handler response (gph-resp) a type of thread in the epc that processes responses from the embedded powerpc. work for this thread is dispatched due to an interrupt and does not use dispatcher memory. general table handler (gth) the gth executes commands not available to a gdh or gfh thread, including hardware assist to perform tree inserts, tree deletes, tree aging, and rope management. it processes data frames when there are no tree management functions to perform. gfh see guided frame handler . gfq guided frame queue gmii gigabit media-independent interface gph-req see general powerpc handler request . gph-resp see general powerpc handler response . gpp general-purpose processor gpq powerpc queue. the queue that contains frames re-enqueued for delivery to the gph for processing. gpr general-purpose register gth see general table handler . gui graphical user interface guided frame handler (gfh) there is one gfh thread available in the epc. a guided frame can be processed only by the gfh thread, but it can be configured to enable it to process data frames like a gdh thread. the gfh executes guided frame- related picocode, runs device management-related picocode, and exchanges control information with a control point function or a remote network processor. when there is no such task to perform and the option is enabled, the gfh can execute frame forwarding-related picocode. gxh generalized reference to any of the defined threads of the epc halfword 2 bytes hc hardware classifier hdlc see high-level data link control . term definition
ibm powernp np4gs3 preliminary network processor np3_dl_sec15_glos.fm.08 may 18, 2001 glossary of terms and abbreviations page 543 of 554 high-level data link control (hdlc) in data communication, the use of a specified series of bits to control data links in accordance with the international standards for hdlc: iso 3309 frame structure and iso 4335 elements of procedures. i- ingress icmp internet control message protocol i-ds ingress data store ingress eds ingress enqueuer / dequeuer / scheduler ingress gdq ingress general data queue ims interface manager services ipc interprocess communication i-pmm ingress-physical mac multiplexer ipps internet protocol version 4 protocol services ipv4 internet protocol version 4 i-sdm ingress switch data mover k2 10 , or 1024 in decimal notation kb kilobyte kb kilobit lcba leaf control block address ldp label distribution protocol leaf a control block that contains the corresponding key as a reference pattern and other user data such as target blade number, qos, and so on. lh latched high (a characteristic of a register or a bit in a register (facility)). when an operational condition sets the facility to a value of 1, the changes to the oper- ational condition do not affect the state of the facility until the facility is read. lid lookup identifier ll latched low (a characteristic of a register or a bit in a register (facility)). when an operational condition sets the facility to a value of 0, the changes to the opera- tional condition do not affect the state of the facility until the facility is read. lls low-latency sustainable bandwidth lpm longest prefix match lsb least significant bit term definition
ibm powernp np4gs3 network processor preliminary glossary of terms and abbreviations page 544 of 554 np3_dl_sec15_glos.fm.08 may 18, 2001 lsb least significant byte lsp label-switched path lu_def look-up definition m 2 20 , or 1 048 576 in decimal notation mac medium access control management information base (mib) in osi, the conceptual repository of management information within an open system. maximum burst size (mbs) in the network processor egress scheduler, the duration a flow can exceed its guaranteed minimum bandwidth before it is constrained to its guaranteed mini- mum bandwidth. maximum transmission unit (mtu) in lans, the largest possible unit of data that can be sent on a given physical medium in a single frame. for example, the mtu for ethernet is 1500 bytes. mbs see maximum burst size. mcc multicast count mcca multicast count address mh mid handler mib see management information base. mid multicast id mm mid manager mms mid manager services mpls multiprotocol label switching mpps million packets per second msb most significant bit msb most significant byte msc message sequence charts mtu see maximum transmission unit. my_tb my target blade nbt next bit to test nfa next frame control block (fcb) address term definition
ibm powernp np4gs3 preliminary network processor np3_dl_sec15_glos.fm.08 may 18, 2001 glossary of terms and abbreviations page 545 of 554 nla next leaf address nls normal latency sustainable bandwidth np network processor npa next pcsb address pointer npasm ibm network processor assembler npdd network processor device driver npddis network processor device driver initialization services npms network processor manager services npscope ibm network processor debugger npsim ibm network processor simulator nptest ibm network processor test case generator ns nanosecond nta next twin-buffer address oqg output queue grant pap password authentication protocol pbs peak bandwidth service pc program counter pcb port control block pcs physical coding sublayer pct port configuration table phy see physical layer . may also refer to a physical layer device. physical layer in the open systems interconnection reference model, the layer that provides the mechanical, electrical, functional, and procedural means to establish, main- tain, and release physical connections over the transmission medium. (t) plb processor local bus pll phased lock loop pma physical medium attachment pmm physical mac multiplexer polcb policing control block term definition
ibm powernp np4gs3 network processor preliminary glossary of terms and abbreviations page 546 of 554 np3_dl_sec15_glos.fm.08 may 18, 2001 pos packetoversonet(alsoipoversonet) post power-on self-test postcondition an action or series of actions that the user program or the npdd must perform after the function has been called and completed. for example, when the func- tion that defines a table in the npdd has been completed, the npdd must dispatch a guided frame from the powerpc core to instruct one or more epcs to define the table. ppc powerpc ppp point-to-point protocol pprev previous discard probability precondition a requirement that must be met before the user program calls the api function. for example, a precondition exists if the user program must call one function and allow it to be completed before a second function is called. one function that has a precondition is the function that deregisters the user program. the user program must call the register function to obtain a user_handle before calling the deregistering function. ps picosecond pscb pattern search control block ptl physical transport layer pts physical transport services quadword 4 words quality of service (qos) for a network connection, a set of communication characteristics such as end- to-end delay, jitter, and packet loss ratio. qcb queue control block qos see quality of service. rbuf raw buffer rcb reassembly control block rclr read current leaf from rope red random early detection rmon remote network monitoring rope leaves in a tree linked together in a circular list rpc remote procedure call term definition
ibm powernp np4gs3 preliminary network processor np3_dl_sec15_glos.fm.08 may 18, 2001 glossary of terms and abbreviations page 547 of 554 rtp real-time transport protocol rum access type ? reset under mask ? r/w access type ? read/write ? rwr access type ? read with reset ? sa source address sap service access point sc self-clearing (a characteristic of a register or a bit in a register (facility)). when written to a value of 1, will automatically reset to a value of 0 when the indicated operation completes. scb scheduler control block sci switch cell interface sdc shared dasl controller sdm switch data mover smt software-managed tree smii serial media-independent interface sof start-of-frame sonet synchronous optical network spm serial/parallel manager sram static random-access memory ss system services sum access type ? set under mask ? swi switch interface target blade grant (tbg) an internal facility used by the ingress scheduler to determine which target blades are accepting data transfer. derived from the output queue grant (oqg) information the network processor receives. tb target blade tbg see target blade grant . tbi ten-bit interface tbr target blade running tb_sof target blade start-of-frame term definition
ibm powernp np4gs3 network processor preliminary glossary of terms and abbreviations page 548 of 554 np3_dl_sec15_glos.fm.08 may 18, 2001 tcp transmission control protocol tdmu target dmu thread a stream of picocode instructions that utilizes a private set of registers and shared computational resources within a dppu. in the network processor, a dppu provides both shared and private resources for four threads. two threads are bound to a single picoprocessor, allowing for concurrent execution of two threads within a dppu. the coprocessors are designed to support all four threads. in general, the computational resources (alu, hardware-assist state machines, and so on) are shared among the threads. the private resources provided by each picoprocessor or coprocessor include the register set that makes up its architecture (gprs, program counters, link stacks, datapool, and so on). tlir tree leaf insert rope tlv type length vectors tms table management services tp target port tsdqfl tree search dequeue free list ts transport services tse treesearchengine tsenqfl treesearchenqueuefreelist tsfl_id tree search free list identifier tsr tree search result ttl time to live uc unicast udp user datagram protocol wfq weighted fair queueing word 4 bytes zbt zero bus turnaround term definition
ibm powernp np4gs3 preliminary network processor np3_dl_sec15_glos.fm.08 may 18, 2001 glossary of terms and abbreviations page 549 of 554 revision log rev description of modification 11/11/99 initial release of databook for ibm32npr161epxcac133 (revision 00), follow-up document to datasheet for ibmnpr100exxcab133 and ibm32npr161epxcac133 (revision 01, 11/08/99) 02/28/00 release ibm32npr161epxcac133 (revision 01), including revised sections 1 and 5. 03/17/00 release ibm32npr161epxcac133 (revision 02), including revised sections 2, 8, 10, 12, and 13 03/31/00 release ibm32npr161epxcac133 (version 03), including revisions to all sections. 05/26/00 release ibm32npr161epxcac133 (revision 04), including revised sections 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14. in section 2: - redefinition of some signal pins - full description of thermal monitor in section 6: - descriptions of the discard port queue (dpq) and the discard queue stack (gdq) - an updated table about valid combinations of scheduler parameters - a table about configuring a flow qcb in section 8: - updated cs address map - updated various registers and operand descriptions in section 14: - a capacitance table was added. - thermal information was updated. 07/13/00 updated revision log for completeness global changes: ibm network processor changed to either ibm power network processor or np4gs3 cross-references updated wording for readability updated typographical errors corrected section 1 - overview: p. 1 features updated p. 2 ordering info table streamlined section 2 - physical description p. 9 figure 4 updated p. 11 table 1 updated (one more riscwatch/jtag added; added note) p. 21 table 12 updated p. 22 table 14 updated pp. 32-3 sections 2.1.2, 2.1.3.1, 2.1.4.2 added p. 56 table 24 updated section 3 - pmm p. 63 table 26 updated p. 68 sections 3.2.1 and 3.2.2.1 added p. 68-71 tables 29 and 30 updated section 4 - ieds p. 80 description of round-robin scheduling rewritten section 5 - switch p. 85 text updated (re: dasl link throughput) pp. 95-6 added more specific sequence info under oqg reporting
ibm powernp np4gs3 network processor preliminary glossary of terms and abbreviations page 550 of 554 np3_dl_sec15_glos.fm.08 may 18, 2001 07/13/00 (cont.) section 6 - eeds overall gpq changed to ppg p. 1 items added to overview p. 100 definitions of dpq and gdq expanded p. 109 table 42 updated p. 113 table 43 updated section 7 - epc p. 122 table 46 updated p. 125 text updated (# of available commands) p. 133 table 53 updated p. 138 twinlease command changed to leasetwin p. 142 figure 40 updated p. 143 table 71 updated p. 145 figure 41 updated p. 146 table 72 updated p. 148-50tables 73, 74, 75, and 78 updated p. 153 note added to table 84 p. 159 table 98 expanded p. 160-1 tables 99, 100, 101, and 102 updated p. 172 dll termination text expanded p. 177 external counter item expanded section 8 - tree p. 188 table 123 updated p. 188-9 section 8.1.2 added p. 190 descriptive text (re: figure 48) added p. 192-229tables 125, 135, 141, 143, 147-162, 164-165, 172, 174-176, 185, 187, and 189 updated section 10 - eppc p. 246 dcr base addresses updated p. 250 plb addresses updated p. 255 sec 10.7.3 register updated p. 256 sec. 10.7.4 register updated p. 259 sec. 10.7.7 register updated p. 260 sec. 10.7.8 register updated section 12 - debug p. 313-4 sec. 12.4 (port mirroring) updated; previous sec 12.4 (probe) deleted section 13 -config p. 316 sch_ena definition updated p. 319 d0_width and dram_size definitions updated p. 326 bit 18 now reserved p. 336 bl_override definition updated p. 338 sec. 13.13.6 changed from enable to disable p. 342 text updated; most significant bit is now 0 p. 343 text updated p. 360 intro text added p. 365 sec. 13.20 gq_max register changed to fcb_fq_max register section 14 - electrical and thermal specs p. 381 table 208 added (replaces old capacitance table) p. 402 table 210 updated section 15 - glossary p. 408 ecc added to list revision log rev description of modification
ibm powernp np4gs3 preliminary network processor np3_dl_sec15_glos.fm.08 may 18, 2001 glossary of terms and abbreviations page 551 of 554 07/28/00 release ibm32npr161epxcac133 (revision 05). all sections revised for: typographical accuracy, syntactical clarity, standardized appearance of field and signal names, updated or additional cross-references for navigation. substantive changes include: preface - p. xix corrected url for ppc405gp section 2 - physical description p. 35-54 tables 22 & 23 updated. signal pin jtag_tdo moved from n22 to aa23. n22 is now unused. section 6 - eeds p. 105 table 40 updated. p. 111 qd description updated. section 7 - epc p. 125 7.2.3 text - configuration quadword array item added p. 154-7 7.2.6.2 - checksum coprocessor commands clarified p. 169 table 112 - gt field comparison deleted section 8 - tree search engine p. 191 8.1.2 text - expanded dt entry item p. 195 8.1.5 text updated p. 197-8 8.2.1 text updated p. 206-8 table 142 updated p. 226 tables 184 and 186 updated section 11 - reset p. 299 intro text added p. 300-1 11.2 step 1 text and table 202 updated p. 306 11.10 step 9 text updated section 14 - electrical & thermal p. 385-405table 212 updated. signal pin jtag_tdo moved from n22 to aa23. n22 is now unused. 09/25/00 release ibm32npr161epxcac133 (revision 06). all sections revised for: typographical accuracy and syntactical clarity. page numbering of front material changed to consecutive arabic numerals. substantive changes include: section 2 - physical description p. 32-70 changed signal types on tables 2, 3, 7-9, 11, 13, 17, 19, 21, 23, 25, 27, 29. added tables 6, 12, 18, 20, 22, 24, 26, 28, 30, 31 added figures 5- 8, 11-16, 20 p. 44-5 tables 14 and 16 - altered field names p. 50-1 table 19 - redefined cpdetect field p. 59-60 table 25 - added note section 4 - ieds p. 117 4.3.3.2 - clarified definition of t field section 6 - eeds p. 141-3 6.3.3 - added res bus section p. 143 6.3.4.2 - expanded third bullet text p. 148 6.4.2.2 - clarified th explanation all changed ppq to gpq, gdq to e-gdq section 7 - epc p. 171-6 added tables 64, 66, 68, 70, 75, 77 p. 191-7 added tables 102, 114 p. 216-7 table 137 - clarified polcb field definitions section 8 - tree search engine p. 246 table 159 - redefined leafth field 8.2.5.2 - clarified text table 160 - redefined threshold field p. 254-5 tables 175, 177 - clarified ludefindex field revision log rev description of modification
  section 10 - eppc
    - p. 325: 10.8.22 - corrected cab address
  section 11 - reset
    - p. 338: 11.2 - added text
    - p. 339: 11.3 - added text
  section 13 - configuration
    - p. 372: 13.11.1 - updated text
    - p. 397: 13.14.5 - redefined fields
  section 14 - electrical & thermal
    - p. 423: updated table 228
    - p. 444-9: added or updated tables 231-47

march 2001
  IBM32NPR161EPXCAD133 (rel. 1.1) & ibm32npr161epxcae133 (rel. 2.0) (revision 07). changed "databook" to "datasheet" throughout. updated figure and table caption formats. substantive changes include:
  section 1 - overview
  section 2 - phy
    - updated all timing diagrams and related tables
    - added timing information for np4gs3a (r1.1) and np4gs3b (r2.0) to all timing tables
    - table 32 (mechanical specifications): removed footer and header shading
    - added definition to d4 data strobes signal in d4_0 / d4_1 memory pins table
    - added definitions to ds0 and ds1 data strobes signals in ds0 and ds1 pins table
    - reworded pci definitions in type column of pci interface pins table
    - changed pll values in clock generation and distribution figure
  section 3 - pmm
    - figure 29: changed txaddr to rxaddr and added footnote 1
    - figure 32: added txaddr row and added footnote 1
  section 4 - ingress-eds
    - figure 49: added set of sof rings
  section 5 - switch interface
    - table 50: revised definitions for qualifier, st, qt(1:0), and endptr
  section 7 - epc
    - 7.2.2/7.2.3: added section on clp commands and opcodes
    - 7.8: added section on semaphore coprocessor
  section 8 - tree search engine
    - 8.2.3.4: added paragraph on caches in ludeftable
    - table 173: updated ludeftable entry definitions
    - 8.2.6: updated bit definitions for ludefcopy_gdh definition
    - 8.2.7: added new introductory paragraph
    - 8.2.7.9 - 8.2.7.12: added distpos_gdh, rdpscb_gdh, wrpscb_gdh, and setpatbit_gdh sections
  section 11 - reset and initialization
    - tables 239 and 242: added information about external physical devices to setup 2 and configure checklists
  section 12 - debug
    - 12.5: added further explanation to step 1 of "set up port mirroring"
  section 13 - configuration
    - 13.1.1: redefined fields
    - 13.1.2: added new base offset and redefined fields
    - 13.6: updated text
    - 13.11.1: updated text
    - 13.14.5: redefined fields, added base offsets 0 and 1
    - 13.30.2: added section on dasl interface
  section 14 - electrical and thermal specifications
    - added new input capacitance table information
    - updated thermal and voltage table information
    - 14.3.1: added dasl interface specification tables
  section 15 - glossary
    - added definition for even parity
    - added definition for picosecond
    - added definition for rope
    - added definition for tbi

  structural and stylistic edits:
  preface
    - rewrote "who should read this manual" section
    - added three urls
    - updated conventions
  section 1 - general information
    - entire chapter rewritten and reorganized for clarity
  section 2 - physical description
    - 2.1: rewrote introduction and created navigation table
    - rearranged tables in section
    - added subsections with introductory paragraphs to match navigation table
  section 3 - physical mac multiplexer
    - changed section title from "pmm overview"
    - minor edits and rewordings throughout
    - 3.4.1: heavy edit
    - 3.5.1: heavy edit
  section 4 - ingress eds
    - rewrote most paragraphs to change text to active voice
  section 5 - switch interface
    - minor edits throughout
    - switched sections 5.4.1 "output queue grant reporting" and 5.4.2 "switch fabric to network processor egress idle cell"; moved tables in those sections
  section 6 - egress eds
    - switched positions of table 6-3 and figure 6-7; adjusted text accordingly
    - minor edits throughout
  section 7 - epc
    - moderate edits throughout
    - moved tree search engine summary from overview into dppu section
    - moved dppu coprocessors and coprocessor opcodes out of "dppu" and set them up as their own level 2 sections
    - combined each "coprocessor commands" list with the corresponding "coprocessor summary" table
  section 9 - serial / parallel manager
    - moderate edits throughout
  section 13 - configuration
    - renamed section (from "np4gs3 configuration")
  sections 8, 10, 11, 12, 14, and 15
    - moderate edits

may 2001
  section 1 - overview
    - minor revisions to sections 1.3 and 1.5
  section 2 - phy
    - timing information for the d6 ddr has been split out from the other ddrs
    - minor changes to timing table information throughout
  section 3 - pmm
    - 3.2.1: pos timing diagrams updated
    - tables 3-6 and 3-7 updated
  section 4 - ieds
    - 4.2: added description of tb_mode register control of multiple in-flight i-eds frames
  section 5 - switch interface
    - table 5.1: requirements for frame packing have been corrected
  section 7 - epc
    - table 7-47 (ingress target queue fcbpage parameters): parameter definition note for pib added
  section 8 - tse
    - added section 8.1.3, logical memory views of d6
  section 10 - powerpc
    - 10.4 (pci/plb macro): added a second overview paragraph explaining configuration options
  section 13 - configuration
    - 13.21: frame control block fq size register definition expanded
    - 13.31.1: dasl configuration register bits 31:29 redefined
  section 14 - electrical and thermal
    - updated tables 14-3 and 14-4

