1989 final year project

steveloughran · Sep 1, 2014 · e53813b
1 parent 5476692
commit e53813b
Showing 48 changed files with 7,171 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -10,3 +10,7 @@
 
 # virtual machine crash logs, see http://www.java.com/en/download/help/error_hotspot.xml
 hs_err_pid*
+
+# Tex artifacts
+*.dvi
+*.aux
diff --git a/papers/urisc/README.md b/papers/urisc/README.md
@@ -0,0 +1,3 @@
+# the Ultimate RISC
+
+This is my final year undergraduate paper from 1989; the formal specification and implementation of a microprocessor which implemented one instruction, MOVE. The specification was done in a version of Standard ML which allowed for the specification of temporal logic, yet allowed pure ML to be used, ML which could then be interpreted. The ALU was so defined, then translated to FPGA form. The rest of the system was built from 74-series ICs and hand-wired.
diff --git a/papers/urisc/abstract.tex b/papers/urisc/abstract.tex
@@ -0,0 +1,16 @@
+%abstract 28/4/89 sal 
+% %W%
+\begin{abstract}
+Computer Design is a highly competitive field, where there is much  interest in the 
+possibility of designing high performance computers quickly, cheaply and reliably.
+The cost and performance requirements may be satisfied by RISC architectures, and the application
+of Formal Methods to hardware design promises new levels of quality.
+This report describes a final year project: the design, formal specification and implementation of the Ultimate RISC, a single instruction computer. 
+
+The project was a demonstration that such a computer is simple enough to be  designed and built within a single year, although
+the implementation does suffer from some limitations.
+
+The concept of a single instruction computer is discussed in general,
+concluding that, whilst it is a fast and compact design which could be useful in some applications, the memory bandwidth it requires limits its performance. 
+
+\end{abstract}
diff --git a/papers/urisc/alu.tex b/papers/urisc/alu.tex
@@ -0,0 +1,114 @@
+\chapter{The ALU}
+
+\section{Design}
+For a computer to be capable of effective operation, it needs the ability to perform
+ actual processing of data. 
+It has long been known that the ability to compare two items and act upon the result is sufficient for effective computation 
+---Turing Machines are based around this concept.
+It would therefore have been possible to build a basic  comparison unit, and rely
+on software  to derive mathematical and logical operations.
+This would have been unreasonably  inefficient.
+All realistic computers   have hardware dedicated to evaluation of these functions.
+These Arithmetic and Logic Units ({\bf ALU}) normally perform at least integer addition, subtraction and the standard boolean functions of two variables.
+More powerful units are capable of high speed multiplication, or even manipulate
+floating point numbers.
+
+
+At the start of the project I was offered the possibility of using a 
+single chip 64-bit floating point ALU from AMD (AM29C327) \cite{amd:uprogramming,amd:29c300}.
+This would have produced impressive performance figures, but I 
+decided that it would have been unworkable, since it was 
+designed for a triple data bus and needed 31 bits of control 
+information every cycle. A single bus system would have  been unable to use this 
+device effectively. 
+
+Instead I designed a very simple ALU, since this made 
+formal specification  possible. The unit was  
+built from eight bit-sliced TTL ICs, each of which operates on four bits.
+ When connected together via a two level 
+ carry lookahead generator, they  perform   operations on 32-bit words.
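+
+As a sketch of why lookahead helps, the textbook active-high form of the carry equations is given below; the signal names $g_i$, $p_i$ and $c_i$ are illustrative only, and the 74F182 itself works with active-low generate and propagate signals.
+\[ g_i = a_i \cdot b_i, \qquad p_i = a_i + b_i, \qquad c_{i+1} = g_i + p_i \cdot c_i \]
+Expanding the recurrence across a group of four bits removes the ripple, so that each carry becomes a two-level function of the inputs:
+\[ c_{4} = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0 \]
+The same equations applied to the group signals of each four-bit slice, and then to groups of four slices, give the two level lookahead described above.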
+
+This 
+is sufficient for many purposes, except that the ability to shift 
+a word  right was needed in iterative multiplication and division algorithms.
+
+ The result of the ALU had to be stored until  re-used in later instructions.
+The state of this result, 
+whether zero or negative, needed to be obtained in a form
+ which could be passed to the Skip register.
+ Arithmetic overflow and carry flags were also desirable,
+   detecting results too large to be represented in 32 bits.
+
+\section{Implementation}
+
+The design of the ALU is shown in figure~\ref{figure:alu}.
+
+An Accumulator stores the output of the ALU between operations.
+This can be read as a memory location.
+The contents of the Accumulator are also used as one of the inputs to the ALU,
+so only one other argument needs to be supplied per operation.
+This accumulator is built out of four SSRS, so can be read directly by the host.
+
+A number of bit sliced ALUs were available with built in accumulator registers. 
+For example,
+the AMD AM2901 (\cite{amd:logic}) or the TTL 74F681 ALU bit slices,
+ would have provided enhanced performance with fewer components and less wiring.
+Using these would have prevented the host examining the Accumulator directly.
+Instead I used 74F381 ALU/function generators in my design.
+These  only perform basic operations ---addition, subtraction, and, or, exclusive or, preset and clear. 
+Three control signals  are used to select a function.
+
+
+Between the outputs of the ALU ICs and the Accumulator is a bank of five PALS.
+Normally these pass the result straight through, each PAL checking whether the bits passed through it are all zero.
+They can also be instructed to shift the result ---including the carry flag--- one bit to the right; 
+this shifting is controlled by a  one bit signal.
+This post shifting  allows a normal operation to be combined with a shift, to make unusual functions such as `subtract and divide by two'.
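+
+A sketch of the post shift in equation form, following the PAL programs in the simulation listing: with $s$ the shift signal, $f_{31} \ldots f_0$ the ALU result and $c$ its carry out, each Accumulator bit $h_i$ is
+\[ h_i = \overline{s} \cdot f_i + s \cdot f_{i+1} \quad (0 \le i \le 30), \qquad h_{31} = \overline{s} \cdot f_{31} + s \cdot c \]
+For an unsigned addition with the carry flag clear, this halves the full 33-bit sum, so the carry is not lost. As an illustrative case (the operand values are chosen here, not taken from the project), adding $2^{32}-1$ and $1$ gives $c = 1$ and $f = 0$; with $s = 1$ the Accumulator receives $2^{31}$, which is exactly $(2^{32}-1+1)/2$.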
+
+The results of the five 
+zero tests along with other signals are fed to another PAL, which 
+produces values for a Condition Code register ({\bf CC}),  constructed from a Shadow Serial Register. 
+The PAL generates a zero flag  when all five slices of the result are zero.
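+
+In equation form, and taking the slice widths from the simulation listing (four PALs of seven bits each, with the fifth handling bits 28 to 31), the zero flag is
+\[ z = z_0 \cdot z_1 \cdot z_2 \cdot z_3 \cdot z_4, \qquad z_k = \prod_{i \in S_k} \overline{h_i} \]
+where $S_k$ is the set of Accumulator bits passing through PAL $k$; the names $z_k$, $h_i$ and $S_k$ are notation introduced for this sketch.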
+
+\begin{figure}
+\vspace{20cm}
+\caption{The ALU}
+\label{figure:alu}
+\end{figure}
+
+\subsubsection{Overflow}
+
+An arithmetic overflow occurs when the result of a signed addition, subtraction or shift is too large to be represented, so that the sign of the result is wrong.
+
+My design of an ALU does not detect signed overflow, despite the original intent to do so.
+I had originally
+acquired equations  from my CS3 notes to detect  overflows using a PAL.
+While specifying the system  I realised
+these equations  only detected overflow on signed addition. 
+To detect overflow in a multi-function ALU, one must compare the carry from
+bit 30 into bit 31 with the carry out of bit 31, the most significant bit; an overflow has occurred if the two differ.
+This cannot be done with the 74F381 bit-sliced devices, as the carry into bit 31 is internal.
+I have discovered that AMD make a special most-significant-slice version of this bit-sliced ALU which does detect overflows internally. 
+The result of this check would however become confused if shifting was performed after the operation, so would not always be reliable.
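+
+As a sketch of the usual two's complement rule, write $c_{31}$ for the carry into bit 31 and $c_{32}$ for the carry out of it (names assumed here); for an addition the overflow flag would be
+\[ v = c_{31} \oplus c_{32} \]
+A four-bit illustration, where the corresponding carries are $c_3$ and $c_4$: $0111 + 0001 = 1000$ has $c_3 = 1$ and $c_4 = 0$, so $v = 1$ and the sum $7 + 1$ has overflowed; $1111 + 0001 = 0000$ has $c_3 = c_4 = 1$, so $v = 0$ and the sum $-1 + 1 = 0$ is correct.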
+
+ Note that even if the ALU did produce an overflow flag, the software would still have to check it after every operation. A number of
+implementations of languages do not do this because of the overhead this entails; 
+APM Pascal and Standard ML are two such  implementations.
+
+\subsubsection{Memory Interface}
+
+Seventeen addresses are allocated to the ALU, as shown in table~\ref{table:memory}.
+One of these addresses returns the current value of the Accumulator whenever it is read.
+The remaining sixteen addresses all apply a different function between the accumulator and the word moved to the selected address.
+This is accomplished by wiring  address bus lines directly to the ALU  and the PALS.
+
+It is not possible to directly load the 
+accumulator, but a two instruction sequence  clears it and then adds a 
+number to the now empty accumulator.
+
+Condition code flag manipulation is  supported:
+reading any of the sixteen function addresses returns one of the condition code flags in the least significant bit. 
+These results can be passed directly to the Skip register for conditional branching.
+Before performing a subtraction the carry flag has to be set to true, while
+for other operations the flag has to be cleared.
+An address is provided to enable this; when it is written to, the least significant bit is passed to the carry flag.
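+
+The need to set the carry before a subtraction follows from two's complement arithmetic. As a sketch, assuming the ALU slices subtract by adding the complement of one operand, with $A$ and $B$ as 32-bit operands and $c$ the carry flag:
+\[ A + \overline{B} + c \equiv A - B \pmod{2^{32}} \quad \mbox{when } c = 1, \qquad \mbox{since } \overline{B} = 2^{32} - 1 - B \]
+With $c = 0$ the result would instead be $A - B - 1$, which is why the flag has to be written before the operation.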
diff --git a/papers/urisc/aluL.tex b/papers/urisc/aluL.tex
@@ -0,0 +1,54 @@
+The components can be combined to specify exactly how the ALU should behave.
+This could, if desired, be verified against the mathematics of a subset of the integers.
+\begin{verbatim}
+(*              ALU.L
+                =====
+
+The definition of the ALU as a whole.
+
+3/1/89 sal: TTL part only; not the PALS
+*)
+
+(* The carry look-ahead unit *)
+(* built out of three TTL devices *)
+
+val carrylookahead #( g0,g1,g2,g3,g4,g5,g6,g7,
+                        p0,p1,p2,p3,p4,p5,p6,p7,
+                        c,c3,c7,c11,c15,c19,c23,c27,carry_out)=
+
+        SN74F182 #(g0,g1,g2,g3,p0,p1,p2,p3,c,c3,c7,c11,G0,P0)
+        /\
+        SN74F182 #(g4,g5,g6,g7,p4,p5,p6,p7,c15,c19,c23,c27,G1,P1)
+        /\
+        SN74F182 #(G0,G1,true,true,P0,P1,true,true,c,c15,carry_out,
+                        false,true,true);
+
+
+(* the TTL portion of the ALU *)
+(* eight four bit slices connected together via the carry generator *)
+(* a= Input a
+   b= Input b
+   c= carry in
+   s2-s0= control signals
+   carry_out= carry out
+   f=evaluated function
+    *)
+
+
+val ttlALU # (a b c s2 s1 s0 carry_out f)=
+        (a7,a6,a5,a4,a3,a2,a1,a0)==split a /\
+        (b7,b6,b5,b4,b3,b2,b1,b0)==split b /\
+        carrylookahead #( g0,g1,g2,g3,g4,g5,g6,g7,
+                        p0,p1,p2,p3,p4,p5,p6,p7,
+                        c,c3,c7,c11,c15,c19,c23,c27,carry_out) /\
+        SN74F381#(a0,b0,c,p0,g0,s2,s1,s0,f0) /\
+        SN74F381#(a1,b1,c3,p1,g1,s2,s1,s0,f1) /\
+        SN74F381#(a2,b2,c7,p2,g2,s2,s1,s0,f2) /\
+        SN74F381#(a3,b3,c11,p3,g3,s2,s1,s0,f3) /\
+        SN74F381#(a4,b4,c15,p4,g4,s2,s1,s0,f4) /\
+        SN74F381#(a5,b5,c19,p5,g5,s2,s1,s0,f5) /\
+        SN74F381#(a6,b6,c23,p6,g6,s2,s1,s0,f6) /\
+        SN74F381#(a7,b7,c27,p7,g7,s2,s1,s0,f7) /\
+        split f=(f7,f6,f5,f4,f3,f2,f1,f0);
+
+\end{verbatim}
diff --git a/papers/urisc/aluS.tex b/papers/urisc/aluS.tex
@@ -0,0 +1,178 @@
+This describes the operation of the ALU.
+Based upon the lambda specification, the components have been redescribed as
+functions, and combined to describe the entire ALU's operation.
+\begin{verbatim}
+(*	alu.SIM		v1.8 *)
+(*	=======		=====*)
+
+         (* Simulation of the ALU *)
+
+(* the ALU bit slice component*)
+
+          fun SN74F182 g0 g1 g2 g3  p0 p1 p2 p3 c= (* c1 c2 c3 G P *)
+          let
+                val G =(g3  && (p3  ||| g2 )
+                                && (p3  ||| p2  ||| g1 )
+                                && (p3  ||| p2 ||| p1  ||| g0))
+                and  P=p0  ||| p1  ||| p2  ||| p3
+                (* carries: active-low 74182 equations, p indices matched to g *)
+                and  c1= not(g0 && (p0 ||| (~ c)))
+                and  c2= not(g1 && (p1 ||| (g0 && (p0 ||| (~ c)))))
+                and  c3= not(g2 && (p2 ||| (g1 && (p1 ||| 
+					(g0 && (p0 ||| (~ c)))))))
+          in
+                (c1,c2,c3,G,P)
+	  end;
+
+(* the lookahead carry generator *)
+
+          fun carrylookahead g0 g1 g2 g3 g4 g5 g6 g7
+                           p0 p1 p2 p3 p4 p5 p6 p7 c=
+          let 
+                val (c3,c7,c11,G0,P0)=SN74F182 g0 g1 g2 g3 p0 p1 p2 p3 c
+          and    G1=(g7  && (p7  ||| g6 )
+                                && (p7  ||| p6  ||| g5 )
+                                && (p7  ||| p6 ||| p5  ||| g4))
+          and    P1=p4 ||| p5 ||| p6 ||| p7
+          in
+                let
+                        val (c15,carry_out,_,_,_)=SN74F182 G0 G1 true true
+                                                           P0 P1  true true
+          							c
+                in
+                        let val (c19,c23,c27,_,_) = SN74F182 g4 g5 g6 g7
+                                                             p4  p5  p6  p7
+          							c15
+                        in
+                                (c3,c7,c11,c15,c19,c23,c27,carry_out)
+                        end
+                end
+          end;
+
+
+(* the TTL part of the ALU *)
+          fun ttlALU a b c s2 s1 s0=
+          let
+                val (a7,a6,a5,a4,a3,a2,a1,a0)=split a
+          and    (b7,b6,b5,b4,b3,b2,b1,b0)=split b
+          in 
+                let
+                        val (c3,c7,c11,c15,c19,c23,c27,carry_out)=
+                                carrylookahead
+                                (generate a0 b0)
+                                (generate a1 b1)
+                                (generate a2 b2)
+                                (generate a3 b3)
+                                (generate a4 b4)
+                                (generate a5 b5)
+                                (generate a6 b6)
+                                (generate a7 b7)
+                                (propagate a0 b0)
+                                (propagate a1 b1)
+                                (propagate a2 b2)
+                                (propagate a3 b3)
+                                (propagate a4 b4)
+                                (propagate a5 b5)
+                                (propagate a6 b6)
+                                (propagate a7 b7)
+                                c
+                in (carry_out, merge 
+                        (applyALU a7 b7 c27 s2 s1 s0)
+                        (applyALU a6 b6 c23 s2 s1 s0)
+                        (applyALU a5 b5 c19 s2 s1 s0)
+                        (applyALU a4 b4 c15 s2 s1 s0)
+                        (applyALU a3 b3 c11 s2 s1 s0)
+                        (applyALU a2 b2 c7 s2 s1 s0)
+                        (applyALU a1 b1 c3 s2 s1 s0)
+                        (applyALU a0 b0 c s2 s1 s0))
+                end
+          end;
+
+(* the shift pal program for the four least significant PALS *)
+
+          fun ALU_SHIFT_PAL_fn shift f7 (f6,f5,f4,f3,f2,f1,f0)=
+          let val z= (~shift && ~f0 && ~f1 && ~f2 && ~f3 && ~f4
+                                  	&& ~f5 && ~f6 ) |||
+                                (shift && ~f1 && ~f2 && ~f3 && ~f4 && ~f5
+                                  && ~f6 && ~f7 )
+          and   h0= ~shift && f0 ||| shift && f1
+          and   h1= ~shift && f1 ||| shift && f2
+          and   h2= ~shift && f2 ||| shift && f3
+          and   h3= ~shift && f3 ||| shift && f4
+          and   h4= ~shift && f4 ||| shift && f5
+          and   h5= ~shift && f5 ||| shift && f6
+          and   h6= ~shift && f6 ||| shift && f7
+
+          in
+                (z,(h6,h5,h4,h3,h2,h1,h0))
+          end;
+
+(* the shift PAL program for the most significant PAL *)
+
+	  fun ALU_SHIFT_PAL_fn_2  shift d0 c f31 f30 f29 f28
+	   =
+		let val h28= ( ~shift && f28 ||| shift && f29) 
+		and h29 = ~shift && f29 ||| shift && f30
+		and h30 = ~shift && f30 ||| shift && f31
+		and h31 = ~shift && f31 ||| shift && c
+		and carry_out = ~shift && c ||| shift && d0
+		and z = ~shift && ~f28 && ~f29 && ~f30 && ~f31
+			|||
+		       shift && ~f29 && ~f30 && ~f31 && ~c
+		in
+			(z,carry_out,(h31,h30,h29,h28))
+		end;
+
+
+(* the condition code generation *)
+          fun ALU_CC_PAL_fn shift z0 z1 z2 z3 z4 carry_in data0 addr4=
+          let
+                val z=z0 && z1 && z2 && z3 && z4
+               and c=carry_in && addr4 ||| data0 && ~addr4
+          in
+                (z,c)
+          end;
+
+(* the complete ALU *)
+(* takes the current ALU state, the data on the bus and the 
+value of the address bus to return an updated ALU state*)
+
+          fun alu (alustate:ALUstate) d a=
+          let
+                val (c,f)=ttlALU (get_acc alustate)
+                                 d
+                                 (get_carry alustate)
+                                 (addressBit2 a)
+                                 (addressBit1 a)
+                                 (addressBit0 a)
+          	and shift=addressBit3 a
+          in 
+                let val ((f31,f30,f29,f28),f3,f2,f1,f0)=
+				split7 f in
+                        let val (z0,h0)=ALU_SHIFT_PAL_fn shift (dataBit7 f) f0
+                            and (z1,h1)=ALU_SHIFT_PAL_fn shift (dataBit14 f) f1
+                            and (z2,h2)=ALU_SHIFT_PAL_fn shift (dataBit21 f) f2
+			    and (z3,h3)=ALU_SHIFT_PAL_fn shift (dataBit28 f) f3
+                            and (z4,c2,(h31,h30,h29,h28))=
+				ALU_SHIFT_PAL_fn_2 shift 
+					(dataBit0 f) c f31 f30 f29 f28
+                        in
+                                let val h = merge7 (h31,h30,h29,h28) 
+                                            h3 h2 h1 h0
+                                in
+                                        let val (z,carry)=
+          					ALU_CC_PAL_fn shift
+                                                z0 z1 z2 z3 z4 c
+						(dataBit0 d)
+						(addressBit4 a)
+                                        in
+                                          ({acc=h,
+                                           z=z,
+                                           n=dataBit31 h,
+                                           v=carry,
+                                           carry=carry}:ALUstate)
+                                        end
+                                end
+                        end
+                end
+          end;
+\end{verbatim}