## Chip Genesis



## Creating Chip Generators for Efficient Computing at Low NRE Design Costs

Ofer Shacham (Stanford University / Chip Genesis Inc.), Professor Mark Horowitz's VLSI Group at Stanford

High Performance Embedded Computing (HPEC) 2011

September 22, 2011



- Explain what a 'Chip Generator' design methodology is
  - Why it is needed
  - Why it solves many problems
- Get you all to download and experiment with Genesis2, a prototype tool for creating chip generators
  - It's a "generator for generators"
  - http://genesis2.stanford.edu/



## **Disruptive Forces Changing The IC Industry**

- Power is today's greatest design constraint
  - Power no longer scales with technology
  - Hand-held/embedded devices drive the market
  - At a fixed power budget: more ops/sec requires less energy/op
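The last bullet follows from a one-line identity. As a worked sketch (the symbols $P$, $R$ and $E$ are notation introduced here, not taken from the slides), let $P$ be the power budget in watts, $R$ the throughput in ops/sec and $E$ the energy per operation in joules:

```latex
P = R \cdot E
\quad\Longrightarrow\quad
R = \frac{P}{E}
```

With $P$ fixed, throughput can only grow if $E$ shrinks by the same factor. For illustration only: a 1 W budget at 100 pJ/op sustains $10^{10}$ ops/sec, and reaching $10^{11}$ ops/sec in the same budget requires 10 pJ/op.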

### Complexity is growing

- Number of devices/chip still scaling
- Algorithmic complexity grows
- NRE costs are out of control



Fewer and fewer applications have markets big enough to justify the effort

#### shacham@stanford.edu

The need for customization has never been greater



## The Biggest Issue



Our systems are so complex that they are very hard to reason about

- Building complex systems is... complicated. And we need to face that fact!
- We need to encode not the system, but the reasoning behind it






## A Wise Man Once Said...

### Platform-based Design – Alberto Sangiovanni-Vincentelli, 2001

Integrated circuits ... will most likely be developed as an instance of a particular (micro-) **architecture platform.** ...they will be derived from a *specific "family" of micro-architectures*, possibly oriented toward a particular class of problems, that can be modified... by the system developer. (pg 6)

An architecture platform instance is derived from an architecture platform by choosing a set of components from the ... library and/or by setting parameters of re-configurable components of the library. (pg 7)





## Industry Went Down Two Roads

### System-on-Chip

"Random" configuration of predefined, pre-verified IP blocks that are fixed in Verilog "cement"



### Re-configurable System

Fixed configuration of run-time flexible IP blocks



## Why Is Reconfiguration Not Efficient?

- A reconfigurable system can be modified / tailored to a specific application
  - At runtime (i.e., via software)
  - Using the same set of resources
- For example, we created *Stanford's Smart Memories* 
  - (Probably) the most reconfigurable CMP ever created
- Fixed architecture w/ mushy (programmable) blocks
  - Does help / amortize the system complexity and reasoning problem!
- Alas... BIG problems:
  - The resource-mix is never optimal (un-utilized resources / missing resources)
  - The configuration itself adds inefficiencies (registers, muxes, etc.)




## Need to Rethink Our Ways Again: Move The Configuration Step To Design Time






## Put Our Understanding of The System In A Tool – Chip Generator








## One Generator Template Encapsulates Systems




## So We Can Explore The Design Space






## Example: CMU's Spiral FFT Generator

Hardware and software co-optimization framework for DSP applications

- Given a transformation, Spiral can easily explore the implementation space
- Final result is optimal hardware and software for a given algorithm + constraints
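To make "explore the implementation space" concrete, here is a minimal, generic sketch of design-space exploration. It is not Spiral's algorithm: the knobs (`lanes`, `word_bits`), the toy cost model in `evaluate`, and the area budget are all assumptions made up for illustration.

```python
from itertools import product
from typing import NamedTuple


class DesignPoint(NamedTuple):
    lanes: int       # hypothetical knob: number of parallel datapath lanes
    word_bits: int   # hypothetical knob: datapath word width
    area: float      # arbitrary units
    latency: float   # arbitrary units


def evaluate(lanes: int, word_bits: int) -> DesignPoint:
    """Toy cost model (an assumption): area grows with lanes * width,
    latency shrinks as lanes are added."""
    return DesignPoint(lanes, word_bits, float(lanes * word_bits), 1024 / lanes)


def pareto_front(points):
    """Keep the points that no other point beats on both area and latency."""
    def dominated(p):
        return any(
            q.area <= p.area and q.latency <= p.latency
            and (q.area < p.area or q.latency < p.latency)
            for q in points
        )
    return [p for p in points if not dominated(p)]


# Enumerate every knob combination, then prune and apply a constraint.
points = [evaluate(l, w) for l, w in product([1, 2, 4, 8], [16, 32])]
front = pareto_front(points)

AREA_BUDGET = 100.0  # hypothetical constraint
feasible = [p for p in front if p.area <= AREA_BUDGET]
best = min(feasible, key=lambda p: p.latency)
```

In a real flow the call to `evaluate` would be replaced by synthesis or simulation results, which is where a generator pays off: each point is produced automatically rather than hand-coded.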





Milder, Franchetti, Hoe and Püschel, DAC, 2008 http://www.spiral.net/hardware/dftgen.html


## Generalizing Hardware Optimization & Generation

- Keep flexibility in sub modules
- Encode cross-module dependencies
- Encode validation and software dependencies
- Encode physical / backend dependencies
- Standardize interfacing with configuration tools
- Formal interface for architect / application designer / end client
- Repeat until target reached
- Repeat for different chips

(Figure labels: Circuit / Micro-architecture Optimization; Synthesis; PNR; GDS2; Feedback; (Semi) Automatic)



## Standardizing The Creation Of Generators

- We Created *Genesis2* 
  - So we (and you!) can build generators
- Premise: Keep it simple. Genesis2 is just an extension for SystemVerilog
  - Works like a pre-processor (\*but it actually does a little more)
  - Remove artificial "synthesizability" limitations from the elaboration code
  - Make elaboration explicit; software-like
- Expressive: Encapsulate design trade-offs in blocks (make generators!)
  - Goal is for each block to have a small elaboration program, a "constructor"
  - Enable late, externally driven, tweaking of knobs
- Instead of coding modules, we write programs that produce modules
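The idea of "programs that produce modules" can be illustrated outside Genesis2. The sketch below uses Python as a stand-in for the elaboration language; it is not Genesis2 syntax, and the function `make_adder_tree` and its parameters are hypothetical names chosen for this example.

```python
def make_adder_tree(name: str, num_inputs: int, width: int) -> str:
    """Elaboration program: return SystemVerilog text for a module
    that sums `num_inputs` operands of `width` bits each."""
    if num_inputs < 2:
        raise ValueError("need at least two inputs")
    if width < 1:
        raise ValueError("width must be positive")

    # One input port per operand; the count is decided at generation time.
    ports = [f"  input  logic [{width - 1}:0] in{i}," for i in range(num_inputs)]

    # Grow the output by ceil(log2(num_inputs)) bits so the sum cannot overflow.
    out_width = width + (num_inputs - 1).bit_length()

    terms = " + ".join(f"in{i}" for i in range(num_inputs))
    lines = (
        [f"module {name} ("]
        + ports
        + [
            f"  output logic [{out_width - 1}:0] sum",
            ");",
            f"  assign sum = {terms};",
            "endmodule",
        ]
    )
    return "\n".join(lines)


# Two different modules from the same program, by turning the knobs.
add3_verilog = make_adder_tree("add3", num_inputs=3, width=8)
add8_verilog = make_adder_tree("add8", num_inputs=8, width=16)
```

The port list and the output width are computed by ordinary software constructs (loops, arithmetic), which is the kind of elaboration logic that is awkward to express within synthesizable HDL alone.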



## Genesis2 Code Snippets





## Example: Our Chip Multiprocessor Generator

- What would adding a processor require?
  - More MBs; Bigger PC; More fabric
- Changing a processor (e.g., scalar to VLIW)?
  - Change the word width of the bus and the relevant memory block
- What about changing the protocol? Or adding a new memory operation?
- Conclusion:
  - Meaningful changes are not just local. They require system-wide modifications
  - Need to encode the system reasoning.
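One way to picture "encoding the system reasoning" is to derive every dependent quantity from a few top-level knobs, so that a change propagates system-wide automatically. The following is a minimal sketch under invented rules: the constants `MEM_BLOCKS_PER_PROC` and `WORD_BITS` and the port formulas are assumptions for illustration, not the actual rules of the CMP generator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileConfig:
    """Top-level knobs an architect would set."""
    num_processors: int
    proc_kind: str = "scalar"  # "scalar" or "vliw"


@dataclass(frozen=True)
class TileDerived:
    """Quantities that must follow from the knobs above."""
    num_mem_blocks: int   # "MBs" on the slide
    bus_word_bits: int
    pc_ports: int         # size of the "PC" block named on the slide
    fabric_ports: int


# Hypothetical design rules, chosen only to illustrate the idea.
MEM_BLOCKS_PER_PROC = 2
WORD_BITS = {"scalar": 32, "vliw": 64}


def derive(cfg: TileConfig) -> TileDerived:
    """Encode the cross-module dependencies once, in one place."""
    if cfg.num_processors < 1:
        raise ValueError("need at least one processor")
    if cfg.proc_kind not in WORD_BITS:
        raise ValueError(f"unknown processor kind: {cfg.proc_kind}")
    mem_blocks = cfg.num_processors * MEM_BLOCKS_PER_PROC
    return TileDerived(
        num_mem_blocks=mem_blocks,
        bus_word_bits=WORD_BITS[cfg.proc_kind],
        pc_ports=cfg.num_processors,
        fabric_ports=cfg.num_processors + mem_blocks,
    )


small = derive(TileConfig(num_processors=1))
big = derive(TileConfig(num_processors=2, proc_kind="vliw"))
```

Adding a processor or switching to VLIW is then a one-field change to `TileConfig`; the memory blocks, bus width, and fabric follow without manual edits to each module.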




## Grow Beyond The Per-Module Scope

- Hierarchical: Build bigger generators out of smaller generators
  - Remember: each generator is the encoding of its designer's thought process and the encapsulation of many design trade-offs
  - So instantiate the generator, not just one instance of "cemented" Verilog
- Scope: Open the system scope to the individual generators
  - To resolve structural constraints between modules

Code fragment from the slide, where the first argument is annotated as the sub-generator name (the second argument is cut off in the source):

`my $CacheObj = generate('Cache', 'Dca…, PROCESSOR => $ProcObj);`

- Still need to control the internal knobs of the cache from the outside world
  - E.g., way size, associativity, etc.
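The two ideas above, hierarchical instantiation and externally controlled knobs, can be sketched together. This is a Python illustration, not Genesis2 code: the `Generator` class, the instance names `proc0` and `cache0`, the parameter names, and the resolution order (parent binding first, then external override, then default) are all assumptions of this sketch.

```python
class Generator:
    """A node in a hierarchy of generators, each with resolvable knobs."""

    def __init__(self, kind, inst_name, parent=None, external=None, **bound):
        self.kind = kind
        self.inst_name = inst_name
        self.path = inst_name if parent is None else f"{parent.path}.{inst_name}"
        self.bound = bound  # values fixed by the parent at instantiation
        if external is not None:
            self.external = external
        else:
            self.external = parent.external if parent is not None else {}
        self.children = []
        self.resolved = {}

    def parameter(self, name, default):
        """Resolve a knob: parent binding, else external override, else default."""
        if name in self.bound:
            value = self.bound[name]
        else:
            value = self.external.get(f"{self.path}.{name}", default)
        self.resolved[name] = value
        return value

    def generate(self, kind, inst_name, **bound):
        """Instantiate a sub-generator, passing structural bindings to it."""
        child = Generator(kind, inst_name, parent=self, **bound)
        self.children.append(child)
        return child


# External overrides are keyed by hierarchical path (hypothetical convention).
overrides = {"top.cache0.ASSOCIATIVITY": 4}

top = Generator("Top", "top", external=overrides)
proc = top.generate("Processor", "proc0")
cache = top.generate("Cache", "cache0", PROCESSOR=proc)

ways = cache.parameter("ASSOCIATIVITY", default=2)   # overridden from outside
way_size = cache.parameter("WAY_SIZE", default=4096)  # falls back to default
owner = cache.parameter("PROCESSOR", default=None)    # bound by the parent
```

Passing the processor object to the cache gives the cache generator access to the system scope, so it can resolve structural constraints itself, while the hierarchical path gives the outside world a stable handle on each internal knob.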




## Standardized Parameter I/O Through XML
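As a rough illustration of what parameter I/O through XML can look like, the sketch below reads a hierarchical parameter file into a flat dictionary, changes one value, and writes it back. The element and attribute names (`instance`, `param`, `name`), the `EXAMPLE_KNOB` parameter, and its placement under `DUT` are hypothetical; this is not the actual Genesis2 schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical parameter file: one element per instance, one per knob.
SAMPLE = """
<instance name="top">
  <param name="MODE">VERIF</param>
  <param name="NUM_PROCESSOR">1</param>
  <instance name="DUT">
    <param name="EXAMPLE_KNOB">8</param>
  </instance>
</instance>
""".strip()


def read_params(elem, prefix=""):
    """Flatten the hierarchy into {'top.DUT.EXAMPLE_KNOB': '8', ...}."""
    path = prefix + elem.get("name")
    params = {f"{path}.{p.get('name')}": p.text for p in elem.findall("param")}
    for child in elem.findall("instance"):
        params.update(read_params(child, path + "."))
    return params


def set_param(root, dotted, value):
    """Update one knob in place, addressed by its hierarchical path."""
    *inst_path, pname = dotted.split(".")
    if not inst_path or root.get("name") != inst_path[0]:
        raise KeyError(dotted)
    node = root
    for inst in inst_path[1:]:
        matches = [c for c in node.findall("instance") if c.get("name") == inst]
        if not matches:
            raise KeyError(dotted)
        node = matches[0]
    for p in node.findall("param"):
        if p.get("name") == pname:
            p.text = str(value)
            return
    raise KeyError(dotted)


root = ET.fromstring(SAMPLE)
params = read_params(root)
set_param(root, "top.DUT.EXAMPLE_KNOB", 16)
updated_xml = ET.tostring(root, encoding="unicode")
```

A text format like this lets optimizers, scripts, and GUIs drive the same generator without knowing anything about its internals.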





## Standard Graphical User Interface

(Screenshot of the web interface. Design hierarchy shown: top, DUT, tb, tst2dut_cfg_ifc, dut2tst_cfg_ifc. The panel below lists the parameters for instance "top".)

User-tweakable parameters:

| Parameter     | Value | Comment                                                         |
|---------------|-------|-----------------------------------------------------------------|
| MODE          | VERIF | This is the mode of the generation. Also, it's a VERY LONG COMM |
| ASSERTION     | ON    | This is the assertion mode of the generation.                   |
| QUAD_ID       | 0     | In this example, can have up to 64 quads(!)                     |
| TILE_ID       | 0     | Any number greater than 0, for illustration.                    |
| NUM_PROCESSOR | 1     | no comment                                                      |
| NUM_MEM_MATS  | 1     | no comment                                                      |

Immutable parameters: (none listed)

Controls: "Submit changes"; a link to download a tar of the current design (.vp and .xml files), which requires "Submit changes" first; a link to a walk-thru demo (CMP generator).
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | ck here to download tar of current design (.vp and .xml files) (must first "Submit changes")<br>Show walk-thru demo (CMP generator)                                                                                                                                                                                                                                                                                                      | Show download and en    | nbed options                         |                            |                |                                                                 |


Work done with Steve Richardson, Megan Wachs, and Andrew Danowitz




## Genesis2 Now Also Used For CMU Generators

Parameters for instance "top.topcontrol_eg.backprojection":

| Parameter             | Value | Kind      |
|-----------------------|-------|-----------|
| TRIPRECISION_LOG2     | 16    | Tweakable |
| INTER_RESOLUTION_LOG2 | 6     | Tweakable |
| ROI_startx            | 0     | Tweakable |
| ROI_starty            | 0     | Tweakable |
| ROI_endx              | 31    | Tweakable |
| ROI_endy              | 31    | Tweakable |
| ROI_height            | 32    | Tweakable |
| ROI_width             | 32    | Tweakable |
| IMAGEWIDTH_LOG2       | 5     | Immutable |
| RADONSIZE_LOG2        | 6     | Immutable |
| IMAGEHEIGHT_LOG2      | 5     | Immutable |
| RADONSIZE_ORIG        | 49    | Immutable |
| PIXELPRECISION_LOG2   | 16    | Immutable |
| ANGLESIZE_LOG2        | 6     | Immutable |

Download tar of current design (.vp and .xml files) (must first "Submit changes")


## Summary / Conclusions

- Power is today's greatest design constraint $\rightarrow$ Calls for customization
- Costs of creating new <u>systems</u> are prohibitive $\rightarrow$ Due to system complexity
- Chip Generators encapsulate the designer's understanding of the system, as well as the design trade-offs
  - Constrained architecture built from (design-time) flexible blocks
  - The artifact produced is the process of making a chip, not a design instance
- Genesis2 is a simple extension to SystemVerilog that enables and standardizes the creation of hierarchical chip generators

*Genesis2* can be downloaded from http://genesis2.stanford.edu/



## **THANK YOU!**




# **BACKUP SLIDES**




## Why Are SoC Costs Out Of Control?



## End Result 1: Better Amortized Cost Structure






## End Result 2: Optimization At The System Level

- Example: Best processor architecture configuration at various power / performance budgets



## End Result 3: Verification Team Also Leverages The Generator





# THE END OF DENNARD SCALING


## Historical Technology Scaling





## Dennard's MOS Scaling (1974)



- In this ideal scaling ($\alpha < 1$)
  - V scales to $\alpha$V, L scales to $\alpha$L, W scales to $\alpha$W
  - So: C scales to $\alpha$C
  - E = V/L is constant, so i scales to $\alpha$i (i/µm is stable)
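
The scaled quantities above can be pushed one step further. As a sketch, assuming the usual first-order models (gate delay $t_d \propto CV/i$ and switching energy $E_{sw} \propto CV^2$, neither of which is stated on the slide), constant-field scaling gives:

$$
\begin{aligned}
t_d &\propto \frac{CV}{i} \;\to\; \frac{(\alpha C)(\alpha V)}{\alpha i} = \alpha\,\frac{CV}{i} \\
E_{sw} &\propto CV^2 \;\to\; (\alpha C)(\alpha V)^2 = \alpha^3\,CV^2 \\
\frac{P}{A} &\propto \frac{E_{sw}/t_d}{LW} \;\to\; \frac{\alpha^3/\alpha}{\alpha^2}\cdot\frac{E_{sw}/t_d}{LW} = \frac{E_{sw}/t_d}{LW}
\end{aligned}
$$

Under these assumptions gates get faster by $\alpha$, each switch costs $\alpha^3$ of the energy, and power per unit area stays constant.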




## The Triple Play of Historical Scaling

Over three decades of constant field (Dennard) scaling

- V scales to $\alpha$V, L scales to $\alpha$L, W scales to $\alpha$W ($\alpha < 1$)

### Everybody wins:

- Get more transistors, more gates
- Gates get faster: delay scales as $\alpha$
- Energy per switch is reduced: it scales as $\alpha^3$
- For the same power and area as the previous design
  - Compute grows as $1/\alpha^3$
- Architects use this to improve computer performance
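
To make the "triple play" concrete, here is a minimal numeric sketch. It assumes the first-order factors of constant-field scaling (density $1/\alpha^2$, delay $\alpha$, energy per switch $\alpha^3$); the function name `dennard_step` and the example value $\alpha = 0.7$ are illustrative choices, not taken from the slides.

```python
def dennard_step(alpha: float) -> dict:
    """First-order effect of one constant-field scaling step, 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    density = 1.0 / alpha**2   # gates per unit area: L and W both shrink by alpha
    delay = alpha              # gate delay ~ C*V/i (assumed first-order model)
    energy = alpha**3          # energy per switch ~ C*V^2 (assumed model)
    return {
        "density": density,
        "delay": delay,
        "energy": energy,
        # Same area, more gates, each switching 1/delay times as often.
        "compute": density / delay,
        # Power per unit area = gates * energy per switch * switching rate.
        "power_density": density * energy / delay,
    }


if __name__ == "__main__":
    for name, value in dennard_step(0.7).items():
        print(f"{name:>13}: {value:.3f}")
```

The `compute` entry equals $1/\alpha^3$ while `power_density` stays at 1, which is the "same power and area" claim in the list above.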



# **RANDOM BACKUP SLIDES**




## Turn The Design Process Inside Out

#### Conventional System-on-Chip Design:

- Designer creates system from complex components
  - IP components are designed in advance
  - End with a new connection of fixed components
- SoC designer has to connect multiple IP blocks:
  - Interface adapters between different blocks
  - Complex verification
#### Chip Generator:

- Designer tunes parts in a "fixed" system architecture
  - Fixed design of squashy components
- Functional interfaces remain constant
  - Reusable validation



## Silicon Compilers Again?

### Well, yes and no

- No: I don't expect it will take application code and "compile" it
- Yes: Each time we create a generator we essentially define a "language" that accepts an architectural "description," and "compiles" it to silicon
- Designers today encode results; we don't encode our knowledge
  - We think of many alternatives in a design; then choose one
    - For the next application perhaps another alternative is best
  - We optimize designs at each level, but then freeze them
    - What happens if we let the design remain flexible?
- Instead, embed explicit elaboration instructions into modules
  - Same function that constructors provide for classes
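
The constructor analogy can be sketched in a few lines. This is plain Python rather than Genesis2 syntax, and everything in it (the `FifoGenerator` class, its `depth` and `width` parameters, the emitted module name) is hypothetical; it only illustrates keeping the elaboration rules next to the module instead of freezing one instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FifoGenerator:
    """Hypothetical generator: the constructor carries the elaboration rules."""
    depth: int = 8
    width: int = 32

    def __post_init__(self):
        # Designer knowledge is encoded as checks, not as one frozen design.
        if self.depth < 2 or self.depth & (self.depth - 1):
            raise ValueError("depth must be a power of two >= 2")
        if self.width < 1:
            raise ValueError("width must be positive")

    @property
    def addr_bits(self) -> int:
        # Derived parameter: always computed from depth, never set by hand.
        return self.depth.bit_length() - 1

    def emit(self, name: str = "fifo") -> str:
        """Return a SystemVerilog-style module skeleton for this configuration."""
        return "\n".join([
            f"module {name}_d{self.depth}_w{self.width} (",
            "  input  logic clk,",
            f"  input  logic [{self.width - 1}:0] din,",
            f"  output logic [{self.width - 1}:0] dout",
            ");",
            f"  logic [{self.addr_bits - 1}:0] wr_ptr, rd_ptr;",
            "endmodule",
        ])


if __name__ == "__main__":
    # Two instances from one description; neither is "the" design.
    print(FifoGenerator(depth=4, width=8).emit())
    print(FifoGenerator(depth=64, width=128).emit())
```

Each application picks its own parameters, and the alternatives the designer considered remain available for the next one.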



## Semiconductor Logic Design Starts



Source: Gartner/Dataquest Market Trends, compiled by Walden C. Rhines, CEO, Mentor Graphics

\* Without programmable logic




## Semiconductor Market Size

### **Industry Revenue Growth: SIA Forecast**





Source: SIA November 2010 Forecast


## Semiconductor Market Breakdown (2010)



Note: Military is <1% and is included in Industrial.
