
pdptrbsh

Trouble shooting the Boot Process

Modifications:
  05/06/2003 correct some html syntax errors
  08/29/2016 correct PSW:Priority bitmap
  06/22/2017 BOOTSTRAP section was missing, correct and add 2nd link


ODT - Octal Debugging Technique Commands

This is a built-in debugging and memory examination tool. I'm just beginning to understand its power as I learn more. I used to dread the sequence of seeing my program crash followed by an address and the ODT prompt, '@'. To be honest I still dread this, but I no longer think the '@' is my enemy. It's my friend; the offending program or hardware that caused the fault is my enemy!

ODT is version specific! From what I read, these console versions of ODT are subsets of the ODT-11 supplied as a linkable object module with Macro-11. The version ROMed into some of the early PDPs (11/03?) supported additional commands such as 'M', a register dump for maintenance mode, and 'L', which uses absolute loader format to load a program through the console SLU. The version described below is what runs on an 11/23. It should work on all LSI-11 systems as it is a minimal set of commands.

ODT runs on the console SLU. The address one uses for this depends on the available address space, but it cannot be adjusted as the address is burned into the ROMs (offset 17560 into the iopage).

                 Address space size
Register   16 bits    18 bits    22 bits
RCSR       177560     777560     17777560
XCSR       = RCSR + 4 (for all)
The CSR registers are polled, not interrupt driven so don't write ahead or data may be lost. Conversely ODT assumes the data transmitted is buffered (although I believe it supports XON/XOFF). This is particularly important for an application like my PC interface described below in another section.

The following list describes the base level ODT commands

  • '/' - slash: prints the contents of a specified location
  • <CR> - Return: closes an open location
  • <LF> - Line feed: closes an open location and opens the next
  • '$' or 'R' - Opens the specified internal register (0-7)
  • 'S' - Opens the PS if it follows a '$' or 'R' character
  • 'G' - Starts execution of a program/code block
  • 'P' - Resumes execution which was halted
  • <CTL><Shift>'S' - binary dump ("manufacturing use only")
  • 'H' - reserved for Digital use!
If you've never done DEC ODT this probably needs a little more help, but testing by example will be instructive. The LSI-11 has several registers (which I believe were memory mapped to the IO page on the Unibus). Registers 0-7, R0-R7, are the general purpose registers. R6 is used as the stack pointer, SP, and R7 is the program counter, PC. The Processor Status Word, PSW, accessed via 'S', affects various aspects of system behavior. All addresses and values are in octal (which always confuses me, I think in hex!).

One of the main things one does with ODT is examine and adjust the contents of memory locations. To examine memory one enters an address followed by '/' and ODT responds with the contents of the address. At this point one may change the current contents: enter an octal number as the new value for the location, followed by one of the following:
  • <CR> closes the location and returns the '@' prompt, waiting for a new address
  • <LF> closes the current location and opens the next, displaying its address and contents. Useful for entering or examining consecutive lines of code.
If you just want to examine the contents, don't enter a new value; just press <CR> or <LF> to close the location or advance to the next.

Note that one can step through successive memory or register locations with <LF> (use <CTL>'j' on a PC keyboard). In the case of registers this is a modulo 8 operation, i.e. it wraps to R0 after R7.
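The open, deposit, and close sequence is easy to generate from a host computer. Below is a minimal Python sketch of that idea. The function name odt_deposit is made up for illustration and is not part of any DEC tool. It only builds the keystrokes; a real sender would have to wait for ODT's echo after every character, since the console is polled and writing ahead may lose data.

```python
from typing import Sequence

CR = "\r"   # closes the open location
LF = "\n"   # closes the open location and opens the next one

def odt_deposit(start: int, words: Sequence[int]) -> str:
    """Build the ODT keystrokes that deposit words at consecutive word addresses."""
    if start % 2:
        raise ValueError("start must be an even (word) address")
    if not words:
        return ""
    keys = f"{start:o}/"                       # open the first location
    for i, word in enumerate(words):
        if not 0 <= word <= 0o177777:
            raise ValueError("each value must fit in 16 bits")
        last = i == len(words) - 1
        keys += f"{word:o}" + (CR if last else LF)
    return keys

# Three words starting at 1000: NOP, then a JMP back to 1000
print(repr(odt_deposit(0o1000, [0o240, 0o137, 0o1000])))
```

All numbers are formatted in octal because that is the only radix ODT accepts.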

The Go command, 'G', starts execution after clearing the PS and FPS. It is often preceded by an address which will be loaded into the PC. If none is specified, 0 is used. E.g. on many systems @173000G starts the boot sequence.

The Proceed command, 'P', resumes execution where it was last halted. The PS, FPS, and PC are not changed. However, after a halt you could have modified these before resuming operation.

The binary dump, <CTL><Shift>'S', is documented for the LSI-11 systems, but then it is suggested that this is only there for DEC manufacturing purposes. Give me a break, if one is trying to automate console ODT via a host computer this is the way to go. One sends ODT the command character, ASCII 0x13 (octal 23), followed by two bytes which form a 16 bit address. Send the msb first followed by the lsb; these are not echoed. ODT responds with 10 bytes of binary data representing the contents of the five words of memory starting at the address specified. Technically only the first 32K words of memory are available this way, but one suspects that using the MMR (Memory Mapping Register) one can access all that is available.
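For anyone automating this from a host, here is a minimal Python sketch of the request and reply handling described above. The names dump_request and parse_dump are hypothetical. The byte order of the 10 reply bytes is not stated above, so the parser assumes low byte first (the order in which bytes sit in PDP-11 memory) and lets you flip it if your system turns out to behave otherwise.

```python
import struct

DUMP_CMD = 0x13  # the command character, octal 23

def dump_request(address: int) -> bytes:
    """Command character followed by the address, msb first then lsb."""
    if not 0 <= address <= 0o177777:
        raise ValueError("address must fit in 16 bits")
    return bytes([DUMP_CMD, (address >> 8) & 0xFF, address & 0xFF])

def parse_dump(reply: bytes, low_byte_first: bool = True) -> list:
    """Split the 10 reply bytes into five 16 bit words.

    ASSUMPTION: low byte first within each word; pass False to swap.
    """
    if len(reply) != 10:
        raise ValueError("expected exactly 10 bytes")
    fmt = "<5H" if low_byte_first else ">5H"
    return list(struct.unpack(fmt, reply))

request = dump_request(0o1000)
print(request.hex())
```

The request bytes would be written to the serial line after seeing the '@' prompt, and the next 10 bytes read back would be handed to parse_dump.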

It appears that the ODT display gives one a hint as to the system it is running on. Namely, my 11/73 (M8012) and 11/23 Plus (M8189) respond to <CTL>'j' with the 8 octal digits required to specify a 22 bit address:
01777560
Conversely, my 11/23 (M8186) responds with the 6 octal digits required to specify either a 16 or 18 bit address:
017560 or 077560
If an address which is out of range is specified, ODT responds with a '?'. You get this for unrecognized mnemonics and invalid addresses.

Using ODT to trouble shoot the System

The "Microcomputers and Memories" suggests a few trial debugging programs. It assumes you can at least get to ODT. The following serve as ODT examples and reasonable places to start your trouble shooting. Below is the simplest program in the world, it does nothing except establish that the processor can run (if you are in ODT likely that's the case). Text from ';' below on will not be input, it is a comment, typically the macro-11 symbol for the instruction.

 001000/240<lf>  ;NOP
 001002/240<lf>  ;NOP
 001004/240<lf>  ;NOP
 001006/137<lf>  ;jmp @#1000, ie loop forever
 001010/1000<cr> ;note <cr> closes location so can enter register values
 R6/1000<cr>     ;set stack to start below code at 1000
 R7/1000<cr>     ;set PC to start address so we can use P and not clr PS
 RS/340<cr>      ;set PS to disable interrupts
 P<cr>           ;typing this starts program
After entering P above the processor should run till you halt it.

I know virtually nothing about the line clock. Guess it has one readable bit and generates BEVNT interrupt service requests. The following program should run for at most 1/50 of a second and then halt displaying the address 106.

;set up interrupt service
 000100/104<lf>  ;BEVNT service address
 000102/340<lf>  ;PS loaded when the interrupt is taken
 000104/0<cr>    ;BEVNT service routine does a halt
;mini loop "forever" program code waiting for BEVNT
 001000/137<lf>  ;jmp @#1000, ie loop forever
 001002/1000<cr>
 R6/1000<cr>     ;setup stack
 01000G          ;typing this starts program

A slightly more complex interrupt service routine tests the console SLU. We know it works in polled mode because that's how ODT does its thing, but this tests interrupt service capability.

 ;setup interrupts
 060/002000<lf>  ;address of receive service routine
 062/000340<cr>
 100/000102<lf>  ;set LTC interrupt to just return
 102/000002<cr>

 ;main program
 01000/012706<lf>  ;mov #1000,sp   set up stack
 01002/1000<lf>
 01004/106427<lf>  ;MTPS #340, raise priority while diddle CSR
 01006/0340<lf>
 01010/012737<lf>  ;mov #100,@#177560 enabling interrupts for Rec
 01012/0100<lf>
 01014/177560<lf>
 01016/106427<lf>  ;MTPS #0, lower priority again
 01020/0<lf>
 01022/000137<lf>  ;jmp @#1022, loop forever waiting for interrupts
 01024/001022<cr>

 ;interrupt service
 02000/105737<lf>  ;TSTB @#177564
 02002/177564<lf>
 02004/100375<lf>  ;BPL .-4, wait for xmit clear so can send it
 02006/013737<lf>  ;mov @#177562,@#177566 input buf to output buf
 02010/177562<lf>
 02012/177566<lf>
 02014/000002<cr>  ;RTI, return from interrupt

 1000G           ;runs program, break should terminate (why??)
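The BPL word 100375 at location 2004 is the one instruction in the listing whose target is not spelled out in the following word. PDP-11 branch instructions keep a signed 8 bit word offset in their low byte, counted from the address of the next instruction. A small Python sketch (branch_target is a made-up helper name) shows how to check such a word by hand before toggling it in:

```python
def branch_target(address: int, instruction: int) -> int:
    """Return the destination of a PDP-11 branch word located at address."""
    offset = instruction & 0o377        # low byte holds the offset in words
    if offset & 0o200:                  # sign bit set, so the offset is negative
        offset -= 0o400
    # The PC already points at the next word when the offset is applied.
    return (address + 2 + 2 * offset) & 0o177777

print(oct(branch_target(0o2004, 0o100375)))
```

For the listing above this confirms that the branch at 2004 goes back to 2000, the first word of the service routine.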

Bootstrap routines

You can toggle a bootstrap into your PDP-11 using ODT. It's especially useful if your normal system device is disabled and you need an alternate boot path. My personal favorite in this situation is to boot a TU58 device over a serial line connected to a remote computer running a TU58 emulator. Almost any supported OS can be booted this way, including XXDP. See my TU58 Emulation page. The DEC convention is that the 512 byte boot block, logical block 0 on the device, is loaded at address 0 in PDP-11 memory, then control is transferred to address 0 to complete the boot process of whatever OS is contained on the device. Since a Google search for 'Dec PDP-11 Bootstrap' turns up a lot of hits I will not duplicate their contents here. I like the suggestions in http://www.psych.usyd.edu.au/pdp-11/bootstraps.html as it includes my favorite device bootstrap, DECtape aka TU58, as well as helpful hints and several other bootstraps of potential interest. One finds lots of information at 'PDP-11 Freeware Archives', and I also recommend their bootstrap directory.

Micronote 15 is of interest as it describes the bootstrap code available in the ROM chips included on various PDP-11 cards to boot different devices (see Qbus Hardware Bootstraps). Note that most of the ROM chips could be replaced, so even if you have a standard board its ROM may be non-standard. There were a number of alternative PDP-11 disk controller vendors back in the day, and they used different boot sequences. If you have a PDP-11 with a custom boot ROM, dumping its contents and publishing them might some day be helpful to the PDP-11 community. I am aware of 3 such controllers:

PDP addresses you might want to know about

This is a pretty quick summary to refresh my memory and give just enough information to make one dangerous. To really understand this I suggest you get some of the DEC reference material. I attempt to summarize some of the key concepts below. Hope I don't confuse the issue too much!

Your machine came with some amount of physical memory, and if it passed the memory test this amount may have been displayed. However, there is another region called the IO page which contains all the hardware registers. The bus address (octal) for the IO page depends on the addressing your chip supports, but it is always located in the top 4K words (8K bytes) of your system's address space:
io page for 16 bit address: 160000 - 177777
io page for 18 bit address: 760000 - 777777
io page for 22 bit address: 17760000 - 17777777
In the discussion below I use offsets into the iopage from the addresses above.
I believe all 11/23 systems supported a memory management unit, MMU, to allow mapping of the virtual 16 bit word addresses to physical addresses. I also believe that these systems power up with the 16 bit virtual IO page mapped to the physical IO page range for your system. You can verify this in ODT. In most of the following discussion I will use 16 bit addresses, if you'd rather use physical addresses you have to add in the appropriate offset.
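Adding in the appropriate offset is simple arithmetic, but it is easy to slip a digit in octal. Here is a minimal Python sketch using the three IO page base addresses listed above. The names IOPAGE_BASE and iopage_address are invented for this example.

```python
# Base of the IO page for each address size (octal), from the list above.
IOPAGE_BASE = {16: 0o160000, 18: 0o760000, 22: 0o17760000}

def iopage_address(offset: int, bits: int = 16) -> int:
    """Return the bus address for an offset into the IO page."""
    if not 0 <= offset <= 0o17777:
        raise ValueError("offset is outside the 8K byte IO page")
    return IOPAGE_BASE[bits] + offset

# The console RCSR is at the same IO page offset on every system.
for bits in (16, 18, 22):
    print(bits, format(iopage_address(0o17560, bits), "o"))
```

The same function works for any register mentioned in this note as long as you start from its 16 bit address minus 160000.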

The processor status word, PSW, is at the top word of the iopage at offset 17776. This is important for programming the system, see the bitmap below:

  bit 3:0   - Condition codes NZVC (in decreasing bit order)
  bit 4     - Trace bit, used in debugging to force single stepping
  bit 7:5   - Priority level (all set inhibits all interrupts)
              These bits control the interrupt levels allowed:
              7 - none, 6 - level 7, 5 - levels 7 & 6,
              4 - levels 7, 6 & 5, 0-3 - levels 7, 6, 5 & 4
  bit 8     - Suspend instruction "reserved for future instruction sets"
  bit 11:9  - reserved
  bit 13:12 - previous memory management mode, PM
  bit 15:14 - current memory management mode, CM

The memory management bits 15:12 are available on all systems, but one assumes they are nonfunctional if there is no MMU. See the MMU discussion below. The sample interrupt program above uses the MTPS instruction to set the priority level to 7 while interrupts are being set up and then sets it to zero to allow all interrupt levels. I believe one can read and write all bits when the system is powered up.
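When staring at a PS value in ODT it helps to split it into the fields of the bitmap above. The following Python sketch does just that. The function names are made up, and allowed_levels assumes the four Qbus interrupt levels 4 through 7, with a level being serviced only when it is higher than the priority field.

```python
def decode_psw(psw: int) -> dict:
    """Split a 16 bit PSW value into the fields of the bitmap above."""
    return {
        "C": psw & 1,
        "V": (psw >> 1) & 1,
        "Z": (psw >> 2) & 1,
        "N": (psw >> 3) & 1,
        "trace": (psw >> 4) & 1,
        "priority": (psw >> 5) & 0o7,
        "previous_mode": (psw >> 12) & 0o3,
        "current_mode": (psw >> 14) & 0o3,
    }

def allowed_levels(priority: int) -> list:
    """Interrupt levels (assumed to be 7, 6, 5, 4) serviced at this priority."""
    return [level for level in (7, 6, 5, 4) if level > priority]

fields = decode_psw(0o340)          # the value used in the vectors above
print(fields["priority"], allowed_levels(fields["priority"]))
```

A PS of 340 decodes to priority 7 with no levels allowed, which is why the sample vectors use it to keep a service routine from being interrupted.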

Although a macro programming course is beyond the scope of this note, I mention two key instructions (mostly to refresh my memory).
RESET = 000005 Causes bus signal BINIT L to be asserted in a sequence that initializes any io devices on the bus. It also clears memory management status registers SR0-SR3.
HALT = 000000 If operating in kernel mode, execution is stopped and the system enters ODT. When memory management is present and the system is in user mode, a trap to location 10 occurs instead (this allows kernel protection for multi-tasking). My book also says that if jumper W7 on the KD11F is removed this trap will occur independent of MMU mode.
The following sequence should initialize the system and return to ODT. Of course you could also toggle the RESET switch on the front panel while the HALT switch is down (disable program control).
1000 000005; RESET
1002 000000; HALT
now run it with "@1000G"

My reference book says if you are working on a Unibus system the general purpose registers are memory mapped into offsets 17700-17707 of the iopage. My Unibus system died, so I can't verify this, but it indicates only one byte per register is mapped into this space?? Immediately above them are the CPU registers at iopage offsets 17710-17716. The Qbus (LSI-11) does not have this region, and I assume one can detect a Unibus by its presence.

Many of the PDP-11s support a memory management unit, MMU. This allows mapping the "virtual" word address from the user's program to a physical address that is outside the 64k region one can access directly with a word address. The offsets into the iopage reserved for this are listed in the tables below.

CAUTION: I am just learning about this and may have something wrong. Let me know if I do! I'm just copying info from a manual at this point, and where possible verifying it on the systems I have.

Addendum: After writing most of this I found it has already been done in Micronote 8: LSI-11/73 Memory Management. Although the article above is about the 11/73, most of the concepts are the same. The 11/73 apparently has more registers to handle the addition of a supervisor mode and the additional PAR/PDR pairs for splitting up code and data spaces, but otherwise handles this in the same way. Good reading!

The MMU performs this feat through the use of two sets, each of which is made up of eight 32 bit active page registers, APR. Each APR is actually two words, a page address register, PAR, and a page description register, PDR. There are two modes of operation for the MMU, kernel mode (super user) CM=0, and user mode CM=3. These modes are controlled by the two CM bits in the PSW described above. You power up in kernel mode; a multi-tasking system will set up its environment and run user programs in user mode. The memory mapping will be such that the user can't reach the PSW directly in memory (it's not protected against writes). In user mode the HALT instruction (see above) is inhibited from halting as this would give the user access to the PSW; instead it traps to location 10. Similarly, in user mode RESET is treated as a NOP and MTPS can only write to bits 3:0. Each mode has its own stack pointer and set of eight APR registers; the ones in use are determined by the PSW CM bits.

          MMU IOpage Usage:
iopage
offset
17572    SR0  Info essential for MMU service per below
    bit 0 - Enable MMU (clear to disable relocation and protection)
    bit 3:1 - page number associated with fault
    bit 6:5 - mode, CM, associated with fault
    bit 13 - abort-read only (write attempt in read only region)
    bit 14 - abort-page length (outside authorized region)
    bit 15 - abort-non-resident (ACF error or illegal PSW mode)
17574    SR1  read only register, LSI-11 always reads zero (handy!)
17576    SR2  The 16 bit virtual user address is loaded here, read only
12516    SR3  (only available on some systems)
    bit 4 if MMU enabled (SR0:0) 0 => 18 bit adr, 1 => 22 bit adr
    bit 5 if unibus iomap exists, setting this bit enables it, else disabled

	Offsets in IOpage to MMU Active Page Registers (APR)
         Kernel APR      User APR
page #	PAR     PDR	PAR	PDR
0	12340	12300	17640	17600
1	12342	12302	17642	17602
2	12344	12304	17644	17604
3	12346	12306	17646	17606
4	12350	12310	17650	17610
5	12352	12312	17652	17612
6	12354	12314	17654	17614
7	12356	12316	17656	17616

PDP memory is broken up into blocks of 32 words (64 bytes) each. The MMU deals with pages which must start on a block boundary, consist of contiguous blocks, and have a size of 1 to 128 blocks. The system converts the user's virtual address, VA, to a physical address, PA, as follows:

	Virtual Address Mapping
   bits 12:0  - byte displacement within the page, DF
     DF (the displacement field) can be thought of as:
   bits 5:0   - byte displacement in block, DIB
   bits 12:6  - block number in page, BN
   bits 15:13 - active page number, 0 - 7
	see table above, ie which APR to use

	PAR
   bits 15:0 - this is the starting block number for the physical
	page in memory.  Add BN and left shift by 6, then add
	DIB to get the byte offset in physical memory.
	The MMU validates that BN is less than PDR.PLF+1 and
	that the access is acceptable per PDR.ACF.

	PDR
   bits 2:1 Access Control Field, ACF, defined as:
	    00 Nonresident - causes abort if accessed
	    01 Read-only - abort if written to
	    10 unused, abort if accessed
	    11 read-write allowed

   bit 3 - Expansion direction. 1 -> downward from block 127
   bit 6 - set if page has been written to since the PDR or PAR
	was last written, at which point it is cleared
   bits 14:8 - Page length field, PLF.  The range 0-127 implies
	1 to 128 blocks allowed in this page.  Error checked
	against the virtual BN before access is allowed.
	Specifying a page with fewer than 128 blocks produces
	holes in the virtual address space; 8 pages with 128
	blocks per page are required for 64kb.
   bits 15,7,5,4,0 are unimplemented and read as 0
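To check my understanding of the mapping I find it useful to do the translation by hand. The Python sketch below follows the field layout above. The names translate and MMUAbort are invented for this example, and the page length rule for the two expansion directions (blocks 0 through PLF when expanding upward, blocks PLF through 127 when expanding downward) is my assumption about how the check works.

```python
class MMUAbort(Exception):
    """Raised where the real MMU would abort the access."""

def translate(va, pars, pdrs, write=False):
    """Map a 16 bit virtual address to a physical byte address.

    pars and pdrs hold the eight PAR and PDR words of the active mode.
    """
    page = (va >> 13) & 0o7          # bits 15:13 select the APR
    bn = (va >> 6) & 0o177           # bits 12:6, block number in page
    dib = va & 0o77                  # bits 5:0, byte displacement in block
    par, pdr = pars[page], pdrs[page]
    acf = (pdr >> 1) & 0o3           # bits 2:1, access control field
    expand_down = bool(pdr & 0o10)   # bit 3, expansion direction
    plf = (pdr >> 8) & 0o177         # bits 14:8, page length field
    if acf in (0b00, 0b10):
        raise MMUAbort("non-resident or unused ACF")
    if write and acf == 0b01:
        raise MMUAbort("write to read-only page")
    # ASSUMPTION: upward pages allow blocks 0..PLF, downward PLF..127.
    if (bn < plf) if expand_down else (bn > plf):
        raise MMUAbort("page length error")
    return ((par + bn) << 6) + dib

# Full length read-write pages, with page 7 pointed at an 18 bit IO page.
pars = [0] * 8
pdrs = [0o77406] * 8
pars[7] = 0o7600
print(format(translate(0o177560, pars, pdrs), "o"))
```

With page 7 set up this way the virtual console RCSR address 177560 lands on 777560, matching the 18 bit column of the console register table near the top of this note.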

First let's look at serial line units, SLU. I originally mentioned the console device when describing ODT. It's difficult to get very far into trouble shooting without one. Each SLU has a group of common word registers and interrupt service registers. There is a control status register, CSR, and a data buffer, BUF, for the receiver (prefix R) and transmitter (prefix X). Most of the sample code I've seen assumes the BUF address immediately follows the CSR, ie BUF = CSR+2. Although it's NOT assumed in these same examples, I've never seen a case where the XCSR didn't immediately follow the RBUF. The standard console addresses are indicated below:
function  address  offset from CSR
RCSR      177560   0
RBUF      177562   2
XCSR      177564   4
XBUF      177566   6
Each SLU also has two consecutive interrupt vector, VEC, locations allocated to it. Lots of boards provide this functionality; typically one selects the base CSR and VEC via wire wrapping pins. The console base VEC=60. I'm not going to try to get into how to do the wire wrapping, there are too many boards (you can find some of this in other people's archives). My advice is if you don't have the documentation, don't try to change addresses on the wire wrap jumpers. Instead search the iopage and figure out how your board is configured; it was probably running at a consistent address! If you want to know more about SLUs try the DL Device Reference; some of these devices have status bits for modem control lines. I will point you at some ways to detect how your system is currently configured. For more information on how DEC expects things to be configured see Micronote 15 for physical ordering and Alan Frisbie's news group post regarding floating address and vector ordering.

My "Microcomputers and Memories" isn't as detailed as Alan's article (Ref 3), but it suggests the LSI-11 Qbus ordering rules are a bit simpler. Read his article first, but apparently the Qbus doesn't require inserting an empty address space when one changes device types.

Rank order for LSI-11 floating addresses:
Rank    Device
 4      DUV11  - SLU with modem control (4 registers)
 8      DZV11  - Multiplex serial line unit (6 registers)
10      RLV11  - RL0? extra controllers (??)

Rank order for LSI-11 floating vectors:
Rank    Device
 2      DLV11,-F,-J
 7      DRV11-B
 8      DRV11
13      DLV11-E
19      KWV11
20      DUV11
At least one of my systems has the DZV11 at 160010 as Alan mentions. It's got 6 registers which behave quite differently than the DUV11s and DJ11s. This system didn't have any DUV11s. If an empty address slot had been there, one could tell just by looking at the address space that the system had a DZV11; as it is, you have to go around and look at the backplane. Tough!

So now you want to know what is in these registers; this should get you started. It's enough that you can see how the trouble shooting console polling routines work. My manuals define bit ranges with a ':', ie bit15:8 means bits 15 through 8 have the same function. Using this terminology:

RCSR - only bits 6 and 7 are used
    bit 6- read/write, a 1 enables receiver interrupts, a 0
	   disables.  Cleared by initialization
    bit 7- read only, receiver done flag.  Set when a character
	   is received.  Cleared by reading character from RBUF.
	   One can poll this bit, or enable interrupts to handle io.

RBUF - bits 7:0 are the read only data bits of the byte received
       bits 15:12 are read only status indicators for this byte,
	    cleared when data is read.
    bit 12- parity error
    bit 13- frame error
    bit 14- overrun error
    bit 15- error (set if any of above are set)

XCSR - similar to RCSR but allows one to send a break
    bit 0- set to 1 to put transmitter in space condition.
           If left in this state for more than 1 character
	   time this will be interpreted as a break.
    bit 6- read/write, a 1 enables transmitter interrupts, a 0
	   disables.  Cleared by initialization
    bit 7- read only, transmitter ready flag.  Set when XBUF
	   is ready to accept data.  Cleared by writing character to XBUF.
	   One can poll this bit, or enable interrupts to handle io.

XBUF - bits 7:0 write only data bits to be sent.
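If you examine these registers from ODT, or capture them from a host, a few lines of Python can turn the octal words into something readable. This is only a sketch of the bit layout listed above. The function names are made up for illustration.

```python
def receiver_done(rcsr: int) -> bool:
    """RCSR bit 7: a character is waiting in RBUF."""
    return bool(rcsr & 0o200)

def transmitter_ready(xcsr: int) -> bool:
    """XCSR bit 7: XBUF can accept another character."""
    return bool(xcsr & 0o200)

def decode_rbuf(word: int) -> dict:
    """Split an RBUF word into the data byte and its status bits."""
    return {
        "char": word & 0o377,                   # bits 7:0
        "parity_error": bool(word & (1 << 12)),
        "frame_error": bool(word & (1 << 13)),
        "overrun_error": bool(word & (1 << 14)),
        "error": bool(word & (1 << 15)),
    }

# An 'A' (octal 101) received with the overrun and summary error bits set.
status = decode_rbuf(0o140101)
print(chr(status["char"]), status["overrun_error"], status["error"])
```

The same masks apply to any SLU that follows the console register layout, not just the one at 177560.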

The only other serial device I have worked with extensively is the 4 line DZV11 multiplex board. I actually purchased the manual for this long ago and can make the register information available if anyone cares. I even wrote an RT11SJ Fortran xmodem program that polls the registers to do its thing; it seems to work fine at 9600 baud.

One can search the IO page using ODT for valid addresses, more detail on the floating device address space is given in the reference above. In all cases below each SLU on the device fills a block of four consecutive registers.

The floating device address space starts at 160010. Assuming the standard configuration, to find DJ11 SLUs one scans up from this address for groups of 4 registers. Stop scanning at the first invalid address found or 163776 which is top of region. Alan Frisbie's floating address space notes say there should be a break between device types, but you can't depend on this.

Scan for DRV11 units starting at 167776 working down toward 167752 for a maximum of 3 SLU's.

Scan for DRV11-B units starting at 172410 and scanning up for a maximum of 3 SLU's. If it's a Qbus these will be at consecutive locations. If it's a Unibus, SLU #2 is in range 172430 - 172436 and SLU #3 is in range 172450 - 172456. Caution: the above is from the Appendix of Microcomputers and Memories and contradicts Alan Frisbie's note above, which says SLU #3 is assigned to the floating address space.

Scan for a maximum of 31 DLV11 SLU's with modem control capability starting at 175610 and working up through 176176.

Scan for a maximum of 16 DLV11 SLU's without modem control capability starting at 176500 and working up through 176676.
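Since each SLU fills four consecutive word registers, the candidate addresses for a scan are easy to list ahead of time. Here is a minimal Python sketch with a made-up helper name, slu_bases. It only does the address arithmetic; probing each address still has to be done in ODT or through a host connection.

```python
def slu_bases(first_csr: int, count: int, registers: int = 4) -> list:
    """Base CSR addresses for count units packed back to back."""
    step = registers * 2            # each register is one word, ie 2 bytes
    return [first_csr + unit * step for unit in range(count)]

# The 16 DLV11 units without modem control start at 176500.
bases = slu_bases(0o176500, 16)
for base in bases[:3]:
    print(format(base, "o"))
print("last register:", format(bases[-1] + 6, "o"))
```

The last register computed this way should agree with the top of the range quoted for the device, which makes it a handy cross check on the numbers in the manuals.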

A few of the other device address ranges of interest are:

  • RC11 - RC unibus controller,csr=177440-177456,vec=210
  • RKV11 - RK controller, csr=177400-177416, vec=220
  • RXV?1 - Rx0? 1st controller, csr=177170-177176, vec=264
  • RLV11 - RL02 controller, csr=174400-174406, vec=160

Will's Works PC to ODT interface

I wrote some DOS based utilities to help me debug problems on a PDP system. They aren't much use unless you have an old DOS system lying around, but if you have an XT or AT (not that a Pentium doesn't work) you can use it to automate an examination of your system. It assumes you can at least get to the ODT prompt on the machine in question. If so, replace your console with the serial line from the PC and run PDPDB.exe.

See odtdosut.com, a self-expanding archive containing the utilities, and PDP DOS Utilities Documentation for full information. They facilitate:

  • Uploading the simple troubleshooting and bootstrap programs described above.
  • Dis-assembly of above to verify their function
  • Dumping memory regions on your halted PDP, esp the IO page usage.

References

#1: Microcomputers and Memories, Digital, 1982
EB-20912-20
esp Chapter 7: ODT Microcode, Chapter 16: LSI-11 System TroubleShooting

#2: LSI-11 Systems Service Manual, Digital, 5th Edition 1985
EK-LSIFS-SV-005

#3: Alan Frisbie's news group posting regarding floating address and vector assignment.

#4: Micronote 8: LSI-11/73 Memory Management.

#5: Micronote 15 about physical ordering of devices in backplane
