MOSIS Report

REQUEST: REPORT
ID: 61688
P-NAME: ARSE
Fab-ID: T11X-EF
P-P:
REPORT:

ARSE is an instruction scheduler.  It scans eight 8-bit instructions of a simple, generic instruction set and reorders them to reduce execution time on a hypothetical processor.  According to our model, a CPU using this instruction set takes one extra latent cycle to process loads and stores if they use the same register, so reordering instructions is a valid way to shorten program run time as long as the final register values remain the same.  In an actual processor, this scheduler could sit between the instruction memory and the cache feeding the CPU, so that order optimization would occur in parallel with program execution.  For this chip, an external memory (Dallas Semiconductor’s DS1304-120) holds the set of eight instructions to be reordered.  When ARSE is enabled, it reads the instructions and, if it finds a latent instruction, attempts to find an independent instruction that can fill the latent hole.  If one is found, the chip writes the instructions back to memory in the improved order.  To perform this operation, ARSE contains a dependency checker, three counters that implement the necessary looping through memory addresses, registers to hold instructions that may change location, adders to change the absolute offset of a moved jump instruction, and a fifteen-state PLA to control the whole chip.
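
The reordering idea described above can be sketched in software.  This is only an illustrative model, not the chip's actual logic: the report does not give the 8-bit encoding, so the Instr record, the mnemonics, and the register names are hypothetical.  The stall rule (a load or store immediately followed by an instruction that depends on it) is our assumed reading of the latency model, and the sketch ignores jumps and memory aliasing entirely.

```python
from typing import FrozenSet, List, NamedTuple


class Instr(NamedTuple):
    """Hypothetical decoded instruction; the real 8-bit encoding is not given."""
    op: str                 # assumed mnemonics, e.g. "load", "store", "add"
    reads: FrozenSet[str]   # registers the instruction reads
    writes: FrozenSet[str]  # registers the instruction writes


def depends(first: Instr, second: Instr) -> bool:
    """True if swapping the two could change the final register values."""
    return bool(
        first.writes & second.reads      # read after write
        or first.reads & second.writes   # write after read
        or first.writes & second.writes  # write after write
    )


def is_latent(prev: Instr, cur: Instr) -> bool:
    """Assumed stall model: a load/store followed by a dependent instruction."""
    return prev.op in ("load", "store") and depends(prev, cur)


def count_stalls(prog: List[Instr]) -> int:
    """Number of adjacent pairs that would cost an extra latent cycle."""
    return sum(is_latent(a, b) for a, b in zip(prog, prog[1:]))


def schedule(prog: List[Instr]) -> List[Instr]:
    """Fill each latent hole with a later instruction, if a safe one exists."""
    prog = list(prog)
    for i in range(len(prog) - 1):
        if not is_latent(prog[i], prog[i + 1]):
            continue
        for j in range(i + 2, len(prog)):
            cand = prog[j]
            # Conservative check: the candidate must be independent of the
            # load/store itself and of every instruction it is moved across.
            if all(not depends(prog[k], cand) for k in range(i, j)):
                prog.insert(i + 1, prog.pop(j))
                break
    return prog


# Hypothetical three-instruction program: a load, its consumer, and an
# unrelated add that can be moved into the latent hole.
load = Instr("load", frozenset(), frozenset({"r1"}))
use = Instr("add", frozenset({"r1", "r3"}), frozenset({"r2"}))
indep = Instr("add", frozenset({"r5", "r6"}), frozenset({"r4"}))

example = [load, use, indep]
print(count_stalls(example), count_stalls(schedule(example)))  # prints "1 0"
```

In this model the unrelated add moves between the load and its consumer, so the stall disappears while every register still receives the same final value.  A program with no latent pair, or with no independent instruction available, is returned unchanged.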

We received five chips from the MOSIS fab.  One chip had a power-to-ground short, probably due to a mask or wafer defect during manufacture.  On the other four chips, power and ground were properly isolated.  However, because of confusion over the orientation of the padframe, two of these chips were lost when power and ground were connected backwards.  Had the fabrication instructions we received stated the power and ground pins explicitly, perhaps these chips could have been saved.  The remaining two chips passed the test procedures with flying colors.

Nine tests were performed on the chips to verify functionality.  Of the nine, four were most important.  The first two covered the case where no latencies are found and the case where a latent instruction is followed by no independent instructions; these exercise the dependency checker portion of the chip.  The third case tests essentially the whole chip, as it contains a latent instruction and an independent instruction that can be moved.  Not only does it check the algorithm, it also checks the interface with memory.  The final key test checks a jump move, which exercises the cascading adders.  One other test checked the chip’s ability to fix two different latencies, and the chip passed this more complicated case as well.

The main limiting factor for chip speed was reading from and writing to external memory.  When a dual non-overlapping clock is used, the maximum frequency at which the chip functions is between 1.7 MHz and 4.5 MHz.  This range is consistent with the expected maximum of 2.08 MHz, which is set by the 120 ns hold time of the memory write-enable signal.  When the clockA and clockB signals have 50% duty cycles, though, performance degrades at 1.7 MHz, while the expected minimum is 4.17 MHz.  Because the chip was designed with a non-overlapping clock in mind, however, lower performance with 50% duty-cycle clocks is to be expected.
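
The two expected figures are consistent with a simple reading of the 120 ns constraint: if the write enable must be held for 120 ns and is asserted for a fixed fraction of each clock period, the shortest usable period is 120 ns divided by that fraction.  The fractions used below (one quarter of the period for the non-overlapping clock, one half for 50% duty cycles) are assumptions chosen because they reproduce the reported numbers; the report itself does not state them, and the function name is illustrative.

```python
HOLD_NS = 120.0  # write-enable hold time quoted in the report, in nanoseconds


def max_freq_mhz(enable_fraction: float, hold_ns: float = HOLD_NS) -> float:
    """Highest clock frequency (MHz) at which the enable still lasts hold_ns.

    enable_fraction is the assumed share of each clock period during which
    the write enable is asserted.
    """
    min_period_ns = hold_ns / enable_fraction
    return 1000.0 / min_period_ns  # a period of 1000 ns corresponds to 1 MHz


# Assumed fractions: 1/4 of the period with the dual non-overlapping clock,
# 1/2 of the period with 50% duty-cycle clocks.
print(round(max_freq_mhz(0.25), 2))  # prints 2.08
print(round(max_freq_mhz(0.50), 2))  # prints 4.17
```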

URL: http://www.owlnet.rice.edu/~morton/vlsi
SUBMITTED BY:
Aamir Virani (aamir@rice.edu), Robert Morton (morton@rice.edu), and Sara MacAlpine (saramac@rice.edu)

REQUEST: END