A Dual-Port Content Addressable Memory
Feasibility Report

André Mathieu, Brian Magnuson, Andrew Wolan

The following is a listing of changes made to our design since our initial proposal.

Memory Size:

Initially, we were asked to target a CAM size of 1 KB (2 KB if you include the tag memory). The 1 KB CAM would consist of 32 blocks of 32 bytes each. However, early size estimates showed that this would be too difficult, if not impossible, to create within the constraints of a 10,000 lambda x 10,000 lambda cell. Therefore, we targeted a more feasible design of 128 bytes, which would consist of four 32-byte blocks.

We soon noted that, due to the complexity and size of the priority encoder and the large size of the tag cell, squeezing in a memory size of 128 bytes was going to be tight. Therefore, we brought the design down to just one 32-byte block.

Go to the main menu and select "layout->overview" for related block diagrams.

Clock:

Our goal was to execute one 32-bit word per clock cycle. The clock rate that we were to use was 50 MHz, which gave us a budget of 20 ns per clock cycle to perform the necessary operations. The basic operations were to perform a match in our tag memory and then output the corresponding data from the data memory. To accomplish this, we split the work into two separate tasks that would each run for half a clock cycle. This in turn tightened our time budget from 20 ns per clock to 10 ns per task. Though the timing was tight, we felt that it could be met.

In reality, some components took close to or more than the 10 ns budget we had set for ourselves. Halving the clock to increase the budget from 10 ns to 20 ns per clock half was not enough. Therefore, we have to bring the clock down one more notch and run our circuit at ¼ the rate of the system clock. At this rate the clock period is 80 ns, which gives 40 ns per clock half. Though this might be more time than we need, we have little control over how we can divide the clock.
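The timing budgets discussed above follow directly from the 50 MHz system clock. As a rough check, the short Python sketch below computes the full clock period and the half-period for each divider setting. The function name and the choice to express the divider as an integer are illustrative assumptions, not part of the design.

```python
SYSTEM_CLOCK_HZ = 50_000_000  # 50 MHz system clock, as stated in the report


def clock_budget_ns(divider):
    """Return (period_ns, half_period_ns) for the system clock divided by `divider`."""
    period_ns = 1e9 * divider / SYSTEM_CLOCK_HZ
    return period_ns, period_ns / 2


# Full rate, half rate, and quarter rate.
for divider in (1, 2, 4):
    period_ns, half_ns = clock_budget_ns(divider)
    print(f"divide by {divider}: period {period_ns:.0f} ns, half {half_ns:.0f} ns")
```

Each task runs in one clock half, so the second value in each pair is the per-task budget.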

Instructions:

Our original instruction set consisted of the following instructions:
Write, Query/Read, Address Read, and Clear.

Since our memory utilized dual-ported SRAM, it only made sense to try to support two of these operations at a time. Supporting two write operations at a time was tricky, but plausible. However, on a write operation, the user needs to specify a 7-bit tag, an 8-bit data value, and a 7-bit address (priority).

These values, including the op code, require 24 bits of storage. This is more than the 16 bits that would be available per instruction if we were to support the execution of two separate operations per clock. Therefore, we had to change the bit format for a write operation such that only one could be executed at a time.
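The bit count can be illustrated with a small Python sketch. The tag, data, and address widths come from the text; the 2-bit op code width is inferred from the 24-bit total (it is also enough to encode four instructions). The field order, the op code value, and the function names are hypothetical and chosen only for illustration; the actual bit format is not specified here.

```python
# Field widths in bits. Tag, data, and address widths come from the report;
# the 2-bit op code width is inferred from the 24-bit total (assumption).
OPCODE_BITS, TAG_BITS, DATA_BITS, ADDR_BITS = 2, 7, 8, 7
WRITE_BITS = OPCODE_BITS + TAG_BITS + DATA_BITS + ADDR_BITS

WORD_BITS = 32               # one 32-bit word per clock
SLOT_BITS = WORD_BITS // 2   # space per operation if two share one word

OP_WRITE = 0b00              # hypothetical op code value


def pack_write(tag, data, addr):
    """Pack a write as opcode | tag | data | addr (assumed field order)."""
    assert 0 <= tag < (1 << TAG_BITS)
    assert 0 <= data < (1 << DATA_BITS)
    assert 0 <= addr < (1 << ADDR_BITS)
    word = OP_WRITE
    word = (word << TAG_BITS) | tag
    word = (word << DATA_BITS) | data
    word = (word << ADDR_BITS) | addr
    return word


def fits_in_slot(bits):
    """True if an instruction of `bits` bits fits in half of a 32-bit word."""
    return bits <= SLOT_BITS


print(WRITE_BITS, SLOT_BITS, fits_in_slot(WRITE_BITS))
```

Under these assumptions a write needs 24 bits, so two writes cannot share one 32-bit word.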

Also, we replaced the "clear" instruction with an "invalidate" instruction. If we clear the tag and data fields for address n to, say, all zeros, and our next read is looking for a tag of all zeros, it is possible that it might get the results from the location we just cleared. To get around this dilemma, we added a valid bit, which is set to '0' on an invalidate.
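The effect of the valid bit can be seen in a small behavioral model. The Python sketch below is not a description of the circuit: the class name, the method names, and the choice that the lowest matching address wins are all assumptions made for illustration.

```python
class CamModel:
    """Behavioral sketch of a CAM with one valid bit per entry (not the circuit)."""

    def __init__(self, entries):
        self.tag = [0] * entries
        self.data = [0] * entries
        self.valid = [False] * entries

    def write(self, addr, tag, data):
        self.tag[addr], self.data[addr] = tag, data
        self.valid[addr] = True

    def invalidate(self, addr):
        # Only the valid bit changes; the old tag and data stay in place.
        self.valid[addr] = False

    def query(self, tag):
        """Return the data of the first valid matching entry, or None."""
        # Assumption: the lowest matching address has the highest priority.
        for addr in range(len(self.tag)):
            if self.valid[addr] and self.tag[addr] == tag:
                return self.data[addr]
        return None


cam = CamModel(32)
cam.write(3, 0b0000000, 0xAB)   # an entry whose tag happens to be all zeros
print(cam.query(0b0000000))     # matches the entry at address 3
cam.invalidate(3)
print(cam.query(0b0000000))     # no match: the entry is no longer valid
```

Because the match is gated by the valid bit, a query for an all-zero tag cannot hit an entry that was never written or that has been invalidated.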

Components:

Changes to each component, big or small, are noted on their corresponding pages under the "components" menu. The components that had the most significant changes were the decoder, data cell and tag cell.

Task Partitioning:
André Mathieu - Tag and Data Cells
Brian Magnuson - Priority Encoder, Decoders, Registers and Layout
Andrew Wolan - Write Buffers and Sense Amp