# Geometric Programming and Its Applications to EDA Problems 

Stephen Boyd Seung Jean Kim<br>S. S. Mohan

## Outline

- Basic approach
- Geometric programming \& generalized geometric programming
- Digital circuit design applications
- Analog and RF circuit design applications
- Monomial and posynomial fitting
- Conclusions


## Basic Approach

## Basic approach

1. formulate circuit design problem as geometric program (GP), an optimization problem with special form
2. solve GP using specialized, tailored method

- this tutorial focuses on step 1 (a.k.a. GP modeling)
- step 2 is technology


## Why?

- we can solve even large GPs very effectively, using recently developed methods
- so once we have a GP formulation, we can solve circuit design problem effectively
we will see that
- GP is especially good at handling a large number of concurrent constraints
- GP formulation is useful even when it is approximate


## Trade-offs in optimization

- general trade-off between generality and effectiveness
- generality
- number of problems that can be handled
- accuracy of formulation
- ease of formulation
- effectiveness
- speed of solution, scale of problems that can be handled
- global vs. local solutions
- reliability, baby-sitting, starting point


## Example: least-squares vs. simulated annealing

least-squares

- large problems reliably (globally) solved quickly
- no initial point, no algorithm parameter tuning
- solves very restricted problem form
- with tricks and extensions, basis of vast number of methods that work (control, filtering, regression, . . . )
simulated annealing
- can be applied to any problem (more or less)
- slow, needs tuning, babysitting; not global in practice
- method of choice for some problems you can't handle any other way


## Where GP fits in

somewhere in between, closer to least-squares . . .

- like least-squares, large problems can be solved reliably (globally), no starting point, tuning, . . .
- solves a class of problems broader than least-squares, less general than simulated annealing
- formulation takes effort, but is fun and has high payoff


## Geometric Programming \& Generalized Geometric Programming

## Monomial \& posynomial functions

$x=\left(x_{1}, \ldots, x_{n}\right):$ vector of positive optimization variables

- function $g$ of form

$$
g(x)=c x_{1}^{\alpha_{1}} x_{2}^{\alpha_{2}} \cdots x_{n}^{\alpha_{n}}
$$

with $c>0, \alpha_{i} \in \mathbf{R}$, is called monomial

- sum of monomials, i.e., function $f$ of form

$$
f(x)=\sum_{k=1}^{t} c_{k} x_{1}^{\alpha_{1 k}} x_{2}^{\alpha_{2 k}} \cdots x_{n}^{\alpha_{n k}}
$$

with $c_{k}>0, \alpha_{i k} \in \mathbf{R}$, is called posynomial

## Examples

with $x, y, z$ variables,

- $0.23,2 z \sqrt{x / y}, 3 x^{2} y^{-.12} z$ are monomials (hence also posynomials)
- $0.23+x / y, 2(1+x y)^{3}, 2 x+3 y+2 z$ are posynomials
- $2 x+3 y-2 z, x^{2}+\tan x$ are neither


## Generalized posynomials

$f$ is a generalized posynomial if it can be formed using addition, multiplication, positive power, and maximum, starting from posynomials

## examples:

- $\max \left\{1+x_{1}, 2 x_{1}+x_{2}^{0.2} x_{3}^{-3.9}\right\}$
- $\left(0.1 x_{1} x_{3}^{-0.5}+x_{2}^{1.7} x_{3}^{0.7}\right)^{1.5}$
- $\left(\max \left\{1+x_{1}, 2 x_{1}+x_{2}^{0.2} x_{3}^{-3.9}\right\}\right)^{1.7}+x_{2}^{1.1} x_{3}^{3.7}$


## Composition rules

- monomials closed under product, division, positive scaling, power, inverse
- posynomials closed under sum, product, positive scaling, division by monomial, positive integer power
- generalized posynomials closed under sum, product, max, positive scaling, division by monomial, positive power


## Generalized geometric program (GGP)

$$
\begin{array}{ll}
\operatorname{minimize} & f_{0}(x) \\
\text { subject to } & f_{i}(x) \leq 1, \quad i=1, \ldots, m \\
& g_{i}(x)=1, \quad i=1, \ldots, p
\end{array}
$$

$f_{i}$ are generalized posynomials, $g_{i}$ are monomials

- called geometric program (GP) when $f_{i}$ are posynomials
- a highly nonlinear constrained optimization problem


## GP example

- maximize volume of box with width $w$, height $h$, depth $d$
- subject to limits on wall and floor areas, aspect ratios $h / w, d / w$

$$
\begin{array}{ll}
\operatorname{maximize} & h w d \\
\text { subject to } & 2(h w+h d) \leq A_{\text {wall }}, \quad w d \leq A_{\text {flr }} \\
& \alpha \leq h / w \leq \beta, \quad \gamma \leq d / w \leq \delta
\end{array}
$$

in standard GP form:

$$
\begin{array}{ll}
\operatorname{minimize} & h^{-1} w^{-1} d^{-1} \\
\text { subject to } & \left(2 / A_{\text {wall }}\right) h w+\left(2 / A_{\text {wall }}\right) h d \leq 1, \quad\left(1 / A_{\text {fir }}\right) w d \leq 1 \\
& \alpha h^{-1} w \leq 1, \quad(1 / \beta) h w^{-1} \leq 1 \\
& \gamma w d^{-1} \leq 1, \quad(1 / \delta) w^{-1} d \leq 1
\end{array}
$$

## GGP example: Floor planning

- choose cell widths, heights
- fixed cell areas
- (1 left of 2 ) above (3 left of 4)
- aspect ratio constraints
- minimize bounding box area


$$
\begin{array}{ll}
\operatorname{minimize} & h w \\
\text { subject to } & h_{i} w_{i}=A_{i}, \quad 1 / \alpha_{\max } \leq h_{i} / w_{i} \leq \alpha_{\max }, \\
& \max \left\{h_{1}, h_{2}\right\}+\max \left\{h_{3}, h_{4}\right\} \leq h, \\
& \max \left\{w_{1}+w_{2}, w_{3}+w_{4}\right\} \leq w
\end{array}
$$

. . . a GGP

## Trade-off analysis

(no equality constraints, for simplicity)

- form perturbed version of original GP, with changed righthand sides:

$$
\begin{array}{ll}
\operatorname{minimize} & f_{0}(x) \\
\text { subject to } & f_{i}(x) \leq u_{i}, \quad i=1, \ldots, m
\end{array}
$$

- $u_{i}>1\left(u_{i}<1\right)$ means $i$ th constraint is relaxed (tightened)
- let $p(u)$ be optimal value of perturbed problem
- plot of $p$ vs. $u$ is (globally) optimal trade-off surface (of objective against constraints)


## Trade-off curves for maximum volume box example



- maximum volume $V$ vs. $A_{\text {frr }}$, for $A_{\text {wall }}=100,1000,10000$
- $h / w, d / w$ aspect ratio limits $0.5,2$


## Sensitivity analysis

- optimal sensitivity of $i$ th constraint is

$$
S_{i}=\left.\frac{\partial p / p}{\partial u_{i} / u_{i}}\right|_{u=1}
$$

- $S_{i}$ predicts fractional change in optimal objective value if $i$ th constraint is (slightly) relaxed or tightened
- very useful in practice; give quantitative measure of how tight a binding constraint is
- when we solve a GP we get all optimal sensitivities at no extra cost


## Example

- minimize circuit delay, subject to power, area constraints (details later)

$$
\begin{array}{ll}
\operatorname{minimize} & D(x) \\
\text { subject to } & P(x) \leq P^{\max }, \quad A(x) \leq A^{\max }
\end{array}
$$

- both constraints tight at optimal $x^{\star}: P\left(x^{\star}\right)=P^{\max }, A\left(x^{\star}\right)=A^{\text {max }}$
- suppose optimal sensitivities are $S^{\mathrm{pwr}}=-2.1, S^{\text {area }}=-0.3$
- we predict:
- for $1 \%$ increase in allowed power, optimal delay decreases 2.1\%
- for $1 \%$ increase in allowed area, optimal delay decreases $0.3 \%$


## GP and GGP attributes

- after log transform of variables/constraints, they become convex problems
- can convert GGP to GP, e.g., $f(x)+\max \{g(x), h(x)\} \leq 1$ becomes

$$
f(x)+t \leq 1, \quad g(x) / t \leq 1, \quad h(x) / t \leq 1
$$

where $t$ is new (dummy) variable

- conversion tricks can be automated
- parser scans problem description, forms GP
- efficient GP solver solves GP
- solution transformed back (dummy variables eliminated)


## How GPs are solved

the practical answer: none of your business
more politely: you don't need to know
it's technology:

- good algorithms are known
- good software implementations are available


## How GPs are solved

- work with log of variables: $y_{i}=\log x_{i}$
- take log of monomials/posynomials to get

$$
\begin{array}{ll}
\operatorname{minimize} & \log f_{0}\left(e^{y}\right) \\
\text { subject to } & \log f_{i}\left(e^{y}\right) \leq 0, \quad i=1, \ldots, m \\
& \log g_{i}\left(e^{y}\right)=0, \quad i=1, \ldots, p
\end{array}
$$

- $\log f_{i}\left(e^{y}\right)$ are (smooth) convex functions
- $\log g_{i}\left(e^{y}\right)$ are affine functions, i.e., linear plus a constant
- solve (nonlinear) convex optimization problem above using interior-point method


## Current state of the art

- basic interior-point method that exploits sparsity, generic GP structure
- approaching efficiency of linear programming solver
- sparse 1000 vbles, 10000 monomial terms: few seconds
- sparse 10000 vbles, 100000 monomial terms: minute
- sparse $10^{6}$ vbles, $10^{7}$ monomial terms: hour
(these are order-of-magnitude estimates, on simple PC)


## History

- GP (and term 'posynomial') introduced in 1967 by Duffin, Peterson, Zener
- engineering applications from the very beginning
- early applications in chemical, mechanical, power engineering
- digital circuit transistor and wire sizing with Elmore delay since 1984 (Fishburn \& Dunlap's TILOS)
- analog circuit design since 1997 (Hershenson, Boyd, Lee)
- other applications in finance, wireless power control, statistics, . . .
- extremely efficient solution methods since 1994 or so (Nesterov \& Nemirovsky)


## Mixed-integer geometric program

$$
\begin{array}{ll}
\operatorname{minimize} & f_{0}(x) \\
\text { subject to } & f_{i}(x) \leq 1, \quad i=1, \ldots, m \\
& g_{i}(x)=1, \quad i=1, \ldots, p \\
& x_{i} \in \mathcal{D}_{i}, \quad i=1, \ldots, k
\end{array}
$$

- $f_{i}$ are generalized posynomials, $g_{i}$ are monomials
- $\mathcal{D}_{i}$ are discrete sets, e.g., $\{1,2,3,4, \ldots\}$ or $\{1,2,4,8 \ldots\}$
- very hard to solve exactly; all methods make some compromise (compared to methods for GP)
- heuristic methods attempt to find good approximate solutions quickly, but cannot guarantee optimality
- global methods always find the global solution, but can be extremely slow

Digital Circuit Design Applications

## Gate scaling



- combinational logic; circuit topology \& gate types given
- gate sizes (scale factors $x_{i} \geq 1$ ) to be determined
- scale factors affect total circuit area, power and delay


## RC gate delay model



- input \& intrinsic capacitances, driving resistance, load capacitance

$$
C_{i}^{\mathrm{in}}=\bar{C}_{i}^{\mathrm{in}} x_{i}, \quad C_{i}^{\mathrm{int}}=\bar{C}_{i}^{\mathrm{int}} x_{i}, \quad R_{i}=\bar{R}_{i} / x_{i}, \quad C_{i}^{\mathrm{L}}=\sum_{j \in \mathrm{FO}(i)} C_{j}^{\mathrm{in}}
$$

## RC gate model

- RC gate delay:

$$
D_{i}=0.69 R_{i}\left(C_{i}^{\mathrm{L}}+C_{i}^{\text {int }}\right)=0.69\left(\bar{R}_{i} \bar{C}_{i}^{\text {in }}+\left(\bar{R}_{i} / x_{i}\right) \sum_{j \in \operatorname{FO}(i)} \bar{C}_{j}^{\text {in }} x_{j}\right)
$$

- $D_{i}$ are posynomials (of scale factors)


## Path and circuit delay



- delay of a path: sum of delays of gates on path
. . . posynomial
- circuit delay: maximum delay over all paths
. . . generalized posynomial


## Area \& power

- total circuit area: $A=x_{1} \bar{A}_{1}+\cdots+x_{n} \bar{A}_{n}$
- total power is $P=P_{\text {dyn }}+P_{\text {stat }}$
- dynamic power $P_{\mathrm{dyn}}=\sum_{i=1}^{n} f_{i}\left(C_{i}^{\mathrm{L}}+C_{i}^{\mathrm{int}}\right) V_{\mathrm{dd}}^{2}$
$f_{i}$ is gate switching frequency
- static power $P_{\text {stat }}=\sum_{i=1}^{n} x_{i} \bar{I}_{i}^{\text {leak }} V_{\mathrm{dd}}$
$\bar{I}_{i}^{\text {leak }}$ is leakage current (average over input states) of unit scaled gate
- $A$ and $P$ are linear functions of $x$, with positive coefficients, hence posynomials


## Basic gate scaling problem

```
minimize D
subject to }P\leq\mp@subsup{P}{}{\operatorname{max}},\quadA\leq\mp@subsup{A}{}{\operatorname{max}
    1\leq xi, i=1,\ldots,n
```


## . . . a GGP

extensions/variations:

- minimize area, power, or some combination
- maximize clock frequency subject to area, power limits
- add other constraints
- optimal trade-off of area, power, delay


## Clock frequency maximization

- $f_{\text {clk }}$ is variable
- timing requirement: $D \leq 0.8 / f_{\text {clk }}$ ( $20 \%$ margin for flip-flop delay, setup time, clock skew . . . )
- $P$ is posynomial of scalings and $f_{\mathrm{clk}}$, assuming $f_{i}$ scale with $f_{\mathrm{clk}}$

$$
\begin{array}{ll}
\operatorname{maximize} & f_{\mathrm{clk}} \\
\text { subject to } & P \leq P^{\max }, \quad A \leq A^{\max }, \quad(1 / 0.8) D f_{\mathrm{clk}} \leq 1 \\
& 1 \leq x_{i}, \quad i=1, \ldots, n
\end{array}
$$

. . . a GGP

## Example: 32-bit Ladner-Fisher adder

- 451 gates (scale factors), 5 gate types, 64 inputs, 32 outputs
- logical effort gate delay model parameters:

| gate type | $\bar{C}^{\text {in }}$ | $\bar{C}^{\text {int }}$ | $\bar{R}$ | $\bar{A}$ | $\bar{I}^{\text {leak }}$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
| INV | 3 | 3 | 0.48 | 3 | 0.006 |
| NAND2 | 4 | 6 | 0.48 | 8 | 0.007 |
| NOR2 | 5 | 6 | 0.48 | 10 | 0.009 |
| AOI21 | 6 | 7 | 0.48 | 17 | 0.003 |
| OAI21 | 6 | 7 | 0.48 | 16 | 0.003 |

- time unit is $\tau$, delay of min-size inverter $(0.69 \cdot 0.48 \cdot 3=1)$
- area (total width) unit is width of NMOS in min-size inverter


## Example: 32-bit Ladner-Fisher adder

- typical optimization time: few seconds on PC



## 32-bit Ladner-Fisher adder with discrete scale factors

- add constraints $x_{i} \in\{1,2,4,8,16, \ldots\}$
- simple rounding of optimal continuous scalings



## Sparse GP gate scaling problem

$$
\begin{array}{ll}
\operatorname{minimize} & D \\
\text { subject to } & T_{j} \leq D \quad \text { for } j \text { an output gate } \\
& T_{j}+D_{i} \leq T_{i} \quad \text { for } j \in \mathrm{FI}(i) \\
& P \leq P^{\max }, \quad A \leq A^{\max } \\
& 1 \leq x_{i}, \quad i=1, \ldots, n
\end{array}
$$

- $T_{i}$ are upper bounds on signal arrival times
- extremely sparse GP; can be solved very efficiently


## Better (generalized posynomial) models

can greatly improve model, while retaining GP compatibility (hence efficient global solution)

- area, delay, power can be any generalized posynomials of scale factors, e.g.,

$$
D_{i}=a_{i}+b_{i}\left(C_{i}^{\mathrm{L}}\right)^{1.05} x_{i}^{-0.9}, \quad P_{i}=c_{i}+d_{i}\left(C_{i}^{\mathrm{L}}\right)^{1.2}+e_{i} x_{i}^{1.1}
$$

- these can be found by more refined analysis, or fitting generalized posynomials to simulation/characterization data


## Distinguishing gate transitions

- can distinguish rising and falling transitions, with different delay, energy, $C^{\text {in }}$, for each gate input/transition
- (bounds on) signal arrival times can be propagated through recursions, e.g.,

$$
T_{i}^{\mathrm{r}}=\max _{j \in \mathrm{FI}(i)}\left\{T_{j}^{\mathrm{r}}+D_{j i}^{\mathrm{rr}}, T_{j}^{\mathrm{f}}+D_{j i}^{\mathrm{fr}}\right\}, \quad T_{i}^{\mathrm{f}}=\max _{j \in \mathrm{FI}(i)}\left\{T_{j}^{\mathrm{r}}+D_{j i}^{\mathrm{rf}}, T_{j}^{\mathrm{f}}+D_{j i}^{\mathrm{ff}}\right\}
$$

- gate scaling problem more complex, but still a GGP (hence can be efficiently solved)


## Modeling signal slopes

- associate (worst-case) output signal transition time $\tau$ with each gate
- model delay, energy, input capacitance as (generalized posynomial) functions of scale factor, load capacitance, input transition time
- propagate output transition time using (generalized posynomial) function of scale factor, load capacitance, input transition time
- common model:

$$
D_{i}=a_{i} C_{i}^{\mathrm{L}} / x_{i}+\kappa_{i} \tau_{i}^{\mathrm{in}}, \quad E_{i}=b_{i}\left(C_{i}^{\mathrm{L}}+c_{i} x_{i}\right)+\lambda_{i} x_{i} \tau_{i}^{\mathrm{in}}, \quad \tau_{i}=\nu_{i} D_{i}
$$

- gate scaling problem still a GGP


## Design with a standard library

- circuit topology is fixed; choose size for each gate from discrete library
- a combinatorial optimization problem, difficult to solve exactly
- GP approach
- for each gate type in library, fit given library data to find GP-compatible models of delay, power, . . .
- size with continuous fitted models, using GP
- snap continuous scale factors back to standard library


## Robust design over corners

- have $K$ corners or scenarios, e.g., combinations of
- process parameters (channel length, oxide thickness, . . .)
- environmental parameters (supply voltage, temperature, . . . )
- for each corner have (slightly) different models for delay, power, . . .
- robust design finds gate scalings that work well for all corners


## Robust design over corners

- basic (worst-case) robust design over corners:

$$
\begin{array}{ll}
\operatorname{minimize} & D^{\mathrm{wc}}=\max \left\{D^{(1)}, \ldots, D^{(K)}\right\} \\
\text { subject to } & P^{(1)}(x) \leq P^{\max }, \ldots, P^{(K)}(x) \leq P^{\max } \\
& A \leq A^{\max } \\
& 1 \leq x_{i}, \quad i=1, \ldots, n
\end{array}
$$

- many variations, e.g., minimize average delay over corners,

$$
D^{\mathrm{avg}}=(1 / K)\left(D^{(1)}+\cdots+D^{(K)}\right)
$$

- results in (very large, but sparse) GGP


## Multiple-scenario design

- have $K$ scenarios or operating modes, with $K$ models for $P, D, \ldots$
- scenarios are combinations of
- supply \& threshold voltages
- clock frequency
- specifications \& constraints
- like corner-based robust design, but scenarios are intentional
- find one set of gate scalings that work well in all scenarios


## Example

- find single set of gate scalings to support both high performance mode and low power mode
- in high performance mode: $P^{\text {fast }} \leq \bar{P}^{\text {fast }}, D^{\text {fast }} \leq \bar{D}^{\text {fast }}$
- in low power mode: $P^{\text {slow }} \leq \bar{P}^{\text {slow }}, D^{\text {slow }} \leq \bar{D}^{\text {slow }}$

$$
\begin{array}{ll}
\operatorname{minimize} & A \\
\text { subject to } & P^{\text {slow }} \leq \bar{P}^{\text {slow }}, \quad D^{\text {slow }} \leq \bar{D}^{\text {slow }} \\
& P^{\text {fast }} \leq \bar{P}^{\text {fast }}, \quad D^{\text {fast }} \leq \bar{D}^{\text {fast }} \\
& 1 \leq x_{i}, \quad i=1, \ldots, n
\end{array}
$$

## . . . a GGP

## Statistical parameter variation

- circuit peformance depends on random device and process parameters
- hence, performance measures like $P, D$ are random variables $\mathbf{P}, \mathbf{D}$
- delay $\mathbf{D}$ is max of many random variables; often skewed to right
- distributions of $\mathbf{P}, \mathbf{D}$ depend on gate scalings $x_{i}$

- related to (parametric) yield, DFM, DFY . . .


## Statistical design

- measure random performance measures by $95 \%$ quantile (say)

$$
\begin{array}{ll}
\operatorname{minimize} & \mathbf{Q}^{\cdot 95}(\mathbf{D}) \\
\text { subject to } & \mathbf{Q}^{95}(\mathbf{P}) \leq P^{\max }, \quad A \leq A^{\max } \\
& 1 \leq x_{i}, \quad i=1, \ldots, n
\end{array}
$$

- extremely difficult stochastic optimization problem; almost no analytic/exact results
- but, (GP-compatible) heuristic method works well


## Statistical model

- for simplicity consider $V_{\text {th }}$ variation only
- Pelgrom's model: $\sigma_{V_{\text {th }}}=\bar{\sigma}_{V_{\text {th }}} x^{-1 / 2}$
- alpha-power law model: $D \propto V_{\mathrm{dd}} /\left(V_{\mathrm{dd}}-V_{\mathrm{th}}\right)^{\alpha}$, with $\alpha \approx 1.3$
- for small variation in $V_{\text {th }}$,

$$
\sigma_{D} \approx\left|\frac{\partial D}{\partial V_{\mathrm{th}}}\right| \sigma_{V_{\mathrm{th}}}=\alpha\left(V_{\mathrm{dd}}-V_{\mathrm{th}}\right)^{-1} \bar{\sigma}_{V_{\mathrm{th}}} x^{-0.5} D
$$

- $\sigma_{D}$ is posynomial
- get similar (posynomial) models for $\sigma_{D}$ with more complex gate delay statistical models


## Heuristic for statistical design

- assume generalized posynomial models for gate delay mean $D_{i}(x)$ and variance $\sigma_{i}(x)^{2}$
- optimize using surrogate gate delays

$$
\tilde{D}_{i}(x)=D_{i}(x)+\kappa_{i} \sigma_{i}(x)
$$

$\kappa_{i} \sigma_{i}(x)$ are margins on gate delays ( $\kappa_{i}$ is typically 2 or 3 )

- verify statistical performance via Monte Carlo analysis (can update $\kappa_{i}$ 's and repeat)


## Heuristic for statistical design

heuristic statistical design

- often far superior to design obtained ignoring statistical variation
- not very sensitive to details of process variation statistics (distribution shape, correlations, . . .)
- below: 32-bit Ladner-Fisher adder, Pelgrom variance model



## Path delay mean/std. dev. scatter plots



## Joint size and supply/threshold voltage optimization

- goal: jointly optimize gate size, supply and threshold voltages via GGP
- need to: model delay, power as generalized posynomial functions of gate size, supply and threshold voltages


## Generalized posynomial delay model

- alpha-power law model predicts variation in gate delay with $V_{\mathrm{dd}}, V_{\mathrm{th}}$ :

$$
D_{i}=\frac{V_{\mathrm{dd}, i}}{\left(V_{\mathrm{dd}, i}-V_{\mathrm{th}, i)^{\alpha}}\right.} \tilde{D}_{i}(x)
$$

$\tilde{D}_{i}$ is generalized posynomial gate delay model, function of scalings $x$

- generalized posynomial approximation

$$
\begin{aligned}
& \qquad \widehat{D}_{i}=V_{\mathrm{dd}, i}^{1-\alpha}\left(1+V_{\mathrm{th}, i} / V_{\mathrm{dd}, i}+\cdots+\left(V_{\mathrm{th}, i} / V_{\mathrm{dd}, i}\right)^{5}\right)^{\alpha} \tilde{D}_{i}(x) \\
& \text { error under } 1 \% \text { for } V_{\mathrm{dd}, i} \geq 2 V_{\mathrm{th}, i}, 1.3 \leq \alpha \leq 2
\end{aligned}
$$

## Generalized posynomial power model

- gate dynamic power: $P_{\mathrm{dyn}}=\sum_{i=1}^{n} f_{i}\left(C_{i}^{\mathrm{L}}+C_{i}^{\mathrm{int}}\right) V_{\mathrm{dd}, i}^{2}$
- simple static power model:

$$
P_{\text {stat }}=\sum_{i=1}^{n} x_{i} \bar{I}_{i}^{\text {leak }} V_{\mathrm{dd}, i}, \quad \bar{I}_{i}^{\text {leak }} \propto e^{-\left(V_{\mathrm{th}, i}-\gamma V_{\mathrm{dd}, i)}\right) / V_{0}}
$$

$\gamma, V_{0}$ are (process) constants

- $P_{\text {stat }}$ (by itself) cannot be approximated well by a generalized posynomial over large range of $V_{\mathrm{dd}}, V_{\mathrm{th}}$
- but, total power $P=P_{\mathrm{dyn}}+P_{\text {stat }}$ can be approximated well by a generalized posynomial


## Generalized posynomial power model example

total power $P=V_{\mathrm{dd}}^{2}+30 V_{\mathrm{dd}} e^{-\left(V_{\mathrm{th}}-0.06 V_{\mathrm{dd}}\right) / 0.039}$ (up to scaling)



- generalized posynomial approximation
$\widehat{P}=V_{\mathrm{dd}}^{2}+0.06 V_{\mathrm{dd}}\left(1+0.0031 V_{\mathrm{dd}}\right)^{500}\left(V_{\mathrm{th}} / 0.039\right)^{-6.16}$
- error under $3 \%$ (well under accuracy of model!)


## Joint optimization of gate sizes, $V_{\mathrm{dd}}$, \& $V_{\mathrm{th}}$

basic problem, with variables: $x_{i}, V_{\mathrm{th}, i}, V_{\mathrm{dd}, i}$

$$
\begin{array}{ll}
\operatorname{minimize} & D \\
\text { subject to } & P \leq P^{\max }, \quad A \leq A^{\max } \\
& V_{\mathrm{th}}^{\min } \leq V_{\mathrm{th}, i} \leq V_{\mathrm{th}}^{\max }, \quad i=1, \ldots, n \\
& V_{\mathrm{dd}}^{\min } \leq V_{\mathrm{dd}, i} \leq V_{\mathrm{dd}}^{\max }, \quad i=1, \ldots, n \\
& \text { other constraints } \ldots
\end{array}
$$

## (. . . a GGP)

discrete allowed $V_{\text {dd }}, V_{\text {th }}$ values yields MIGP

## Extensions/variations

- clustering, with single $V_{\mathrm{dd}}, V_{\mathrm{th}}$ per cluster:

$$
V_{\mathrm{dd}, i}=V_{\mathrm{dd}, j}, \quad V_{\mathrm{th}, i}=V_{\mathrm{th}, j} \quad \text { for } i, j \text { in same cluster }
$$

... monomial (equality) constraints

- clustered voltage scaling (CVS): low $V_{\mathrm{dd}}$ cells cannot drive high $V_{\mathrm{dd}}$ cells

$$
V_{\mathrm{dd}, j} \leq V_{\mathrm{dd}, i} \quad \text { for } j \in \mathrm{FO}(i)
$$

... monomial (inequality) constraints

- multimode design: choose single set of gate scalings, different $V_{d d}^{(k)}$, $V_{\mathrm{th}}^{(k)}$ for each scenario $k=1, \ldots, K$ related to dynamic voltage scaling, adaptive bulk biasing, . . .


## Joint optimization examples

- Ladner-Fisher adder
- variables: gate scalings $x_{i}$, supply voltages $V_{\mathrm{dd}, i}$, threshold voltages $V_{\mathrm{th}, i}$
- four delay-power trade-off curves:
- fixed $V_{\mathrm{dd}, i}=1.0$, fixed $V_{\mathrm{th}, i}=0.3$
- fixed $V_{\mathrm{dd}, i}=1.0$, variable $V_{\mathrm{th}, i} \in\{0.2,0.3,0.4\}$
- CVS with $V_{\mathrm{dd}, i} \in\{0.6,1.0\}, V_{\mathrm{th}, i} \in\{0.2,0.3,0.4\}$
- variable continuous $V_{\mathrm{dd}}, V_{\mathrm{th}}, 0.6 \leq V_{\mathrm{dd}, i} \leq 1.0,0.2 \leq V_{\mathrm{th}, i} \leq 0.4$ (not practical, but serves as lower bound)


## Trade-off curve analysis



## Design with multiple threshold voltages



## Clustered voltage scaling



## Wire and device sizing

RC tree:


- $R_{i} \mathrm{~s}$ and $C_{i} \mathrm{~s}$ are generalized posynomials of some underlying variables $x$


## Elmore delay

- Elmore delay at node $i$ :

$$
D_{i}=\int_{0}^{\infty} v_{i}(t) d t
$$

area under voltage curve, when voltages are initialized as $v_{i}(0)=1$


- Elmore delay of RC tree is $D=\max \left\{D_{1}, \ldots, D_{N}\right\}$


## Elmore delay expression

- analytic expression for Elmore delay $D_{i}$

$$
D_{i}=\sum_{j \in \mathbf{P}(i)} R_{i} C_{i}^{\text {tot }}
$$

- $\mathbf{P}(i)$ is path from root to node $i$
- $C_{i}^{\text {tot }}$ is the total capacitance downstream from node $i$ (including $C_{i}$ )
- $D_{i}$ is posynomial of $x$
- $D$ is generalized posynomial of $x$


## RC tree optimization

- minimize RC tree delay subject to (generalized posynomial) constraints

$$
\begin{array}{ll}
\operatorname{minimize} & D \\
\text { subject to } & f_{i}(x) \leq 0, \quad i=1, \ldots, m
\end{array}
$$

. . . a GGP

- sparse formulation:

$$
\begin{array}{ll}
\operatorname{minimize} & s \\
\text { subject to } & s \geq D_{i}, \quad i=1, \ldots, n \\
& C_{j}^{\text {tot }} \geq \sum_{i \in \operatorname{Child}(j)} C_{i}^{\text {tot }}+C_{j}, \quad i=1, \ldots, n \\
& D_{i} \geq D_{\operatorname{Par}(k)}+R_{i} C_{i}^{\text {tot }}, \quad i=1, \ldots, n \\
& f_{i}(x) \leq 0, \quad i=1, \ldots, m
\end{array}
$$

## Wire sizing

- choose wire segment widths $w_{i}, \ldots, w_{N}$ in an interconnect network
- optimize delay, area



## $\pi$ model for wire segment



- wire resistance and capacitances

$$
R_{i}=\alpha_{i} \frac{l_{i}}{w_{i}}, \quad \bar{C}_{i}=\beta_{i} l_{i} w_{i}+\gamma_{i} l_{i},
$$

- with $\pi$ model, interconnect network becomes RC tree, with $R_{i} \mathrm{~s}$ and $C_{i} \mathrm{~s}$ posynomial functions of wire segment widths $w_{i}$


## Wire sizing via GP

```
minimize D
subject to }\mp@subsup{w}{i}{min}\leq\mp@subsup{w}{i}{}\leq\mp@subsup{w}{i}{max},\quadi=1,\ldots,
    l}\mp@subsup{l}{1}{}\mp@subsup{w}{1}{}+\cdots+\mp@subsup{l}{N}{}\mp@subsup{w}{N}{}\leq\mp@subsup{A}{}{\operatorname{max}
```

. . . a GGP

- can easily optimize interconnect network with 10000 wires, using sparse GP formulation
- can use more accurate generalized posynomial models of $R_{i}, C_{i}$


## Device sizing

- devices (and wire segments) are sized individually
- replace each device with switch-level RC model
- each transition is associated with RC tree
- use Elmore delay to measure delay of transition
- . . problem is GGP


## Switch-level RC device model



- crude linear approximation of device, for delay and power optimization
- $R$, all $C$ s are generalized posynomials of device width
- we'll ignore $C_{\mathrm{gd}}$ (but can be incorporated via Miller effect . . .)


## Example: 2-input NAND



## Example transition

- transition: B falls from $V_{\mathrm{dd}}$ to zero; A remains at $V_{\mathrm{dd}}$
- associated RC tree:


$$
C_{1}=C_{\mathrm{db} 1}+C_{\mathrm{db} 2}+C_{\mathrm{db} 3}, C_{2}=C_{\mathrm{sb} 3}+C_{\mathrm{db} 4}
$$

- Elmore delay: $D=R_{\mathrm{sd} 1}\left(C^{\mathrm{L}}+C_{1}+C_{2}\right)$
- energy lost: $E=\left(C^{\mathrm{L}}+C_{1}+C_{2}\right) V_{\mathrm{dd}}^{2} / 2$

Analog and RF Circuit Design Applications

## Large signal MOS model



- gate overdrive voltage $V_{\mathrm{gov}}=V_{\mathrm{gs}}-V_{\mathrm{th}}$
- saturation condition: $V_{\mathrm{ds}} \geq V_{\text {dsat }}=V_{\text {gov }}$ ( $V_{\text {dsat }}$ is minimum drain-source voltage for device to operate in saturation)
- square-law model $I=0.5 \mu C_{\text {ox }}(W / L) V_{\text {gov }}^{2}$
- GP model variables: $I, L, W$
- $V_{\text {gov }}=\left(\mu C_{\mathrm{ox}} / 2\right)^{-1 / 2} I^{1 / 2} L^{1 / 2} W^{-1 / 2}$ is monomial
- $V_{\mathrm{gs}}=V_{\mathrm{gov}}+V_{\mathrm{th}}$ is posynomial


## Small signal dynamic MOS model



- transconductance $g_{\mathrm{m}}=\left(2 \mu C_{\mathrm{ox}}\right)^{1 / 2} I^{1 / 2} L^{-1 / 2} W^{1 / 2}$ is monomial
- output conductance $g_{\mathrm{o}}=\lambda I$ is monomial
- all capacitances are (approximately) posynomial in $I, L, W$
- better (GP-compatible) models can be obtained by fitting data from accurate models or measurements


## Example: monomial $g_{\mathrm{m}}$ model

- monomial model of $g_{\mathrm{m}}$ for I/O NMOS device in a $0.13 \mu \mathrm{~m}$ technology
- 11000 data points (from BSIM3) over ranges
$-0.3 \mu \mathrm{~m} \leq L \leq 3 \mu \mathrm{~m}, \quad 2 \mu \mathrm{~m} \leq W \leq 20 \mu \mathrm{~m}$
$-0.7 \mathrm{~V} \leq V_{\mathrm{gs}} \leq 1.7 \mathrm{~V}, \quad V_{\mathrm{dsat}} \leq V_{\mathrm{ds}} \leq 1.5 V_{\mathrm{gs}}$
- $V_{\mathrm{ds}}$ appears in data set, but not in $g_{\mathrm{m}}$ model
- monomial fit (using simple log-regression, SI units):

$$
g_{\mathrm{m}}=0.0278 I^{0.4798} L^{-0.511} W^{0.5632}
$$

## Example: monomial $g_{\mathrm{m}}$ model

- fitting (relative) error cumulative distribution plot:

- for $90 \%$ of points, fit is better than $4 \%$


## Single transistor common source amplifier

- variables: $I, L, W, R$
- saturation: $V_{\text {dsat }}+I R \leq V_{\mathrm{dd}}$
- gain $G=g_{\mathrm{m}} /\left(1 / R+g_{\mathrm{o}}\right)$
- power $P=V_{\mathrm{dd}} I$
- (unity gain) bandwidth $B=g_{\mathrm{m}} / C^{\mathrm{L}}$
- design problem:

$$
\begin{array}{ll}
\operatorname{minimize} & P \\
\text { subject to } & B \geq B^{\min }, \quad G \geq G^{\min } \\
\text { saturation }
\end{array}
$$

## Common source amplifier design via GP

- rewrite as

$$
\begin{array}{ll}
\operatorname{minimize} & P \\
\text { subject to } & B^{-1} \leq 1 / B^{\min }, \quad G^{-1} \leq 1 / G^{\min } \\
& V_{\mathrm{dsat}}+I R \leq V_{\mathrm{dd}}
\end{array}
$$

- . . . a GP, since $P$ and $B$ are monomials, and

$$
G^{-1}=\frac{1 / R+g_{\mathrm{o}}}{g_{\mathrm{m}}}
$$

is posynomial

- this is a simple problem; don't need GP sledgehammer . . .


## Current mirror opamp



- $M_{1}, M_{2}$ and $M_{3}, M_{4}$ matched pairs
- four current mirrors: $M_{8}, M_{5} ; M_{10}, M_{7} ; M_{9}, M_{3} ; M_{4}, M_{6}$


## Design problem

$$
\begin{array}{ll}
\operatorname{minimize} & P \\
\text { subject to } & B \geq B^{\min }, \quad G \geq G^{\min }, \quad A \leq A^{\max } \\
& \text { other constraints } . .
\end{array}
$$

- objective \& specifications:
- $P$ is power dissipation
$-B$ is unity gain bandwidth
- $G$ is DC gain
- $A$ is (active) area
- design variables: $L_{1}, \ldots, L_{10}, W_{1}, \ldots, W_{10}$
- given: $V_{\mathrm{dd}}, C_{\mathrm{L}}, I_{\text {ref }}$, common-mode voltage $V_{\mathrm{cm}}$
- we'll formulate as GP


## Power, bandwidth, gain, \& area

- power: $P=V_{\mathrm{dd}}\left(I_{8}+I_{5}+I_{7}+I_{10}\right)$
... posynomial
- bandwidth: $B=g_{\mathrm{m}, 2} g_{\mathrm{m}, 6} /\left(g_{\mathrm{m}, 4} C_{\mathrm{L}}\right)$
... monomial
- area: $A=W_{1} L_{1}+\cdots+W_{10} L_{10}$
... posynomial
- gain: $G=\frac{g_{\mathrm{m}, 2} g_{\mathrm{m}, 6}}{g_{\mathrm{m}, 4}\left(g_{\mathrm{o}, 6}+g_{\mathrm{o}, 7}\right)}$
$\ldots G^{-1}$ is posynomial, so $G \geq G^{\mathrm{min}}$ can be written as $G^{-1} \leq 1 / G^{\mathrm{min}}$


## Dimension, matching, and current constraints

- limits on device sizes: $L_{\text {min }} \leq L_{i} \leq L_{\max }, W_{\min } \leq W_{i}, i=1, \ldots, 10$
- differential symmetry constraints ( $M_{1}, M_{2}$ and $M_{3}, M_{4}$ matched):

$$
\begin{array}{lll}
W_{1}=W_{2}, & L_{1}=L_{2}, & I_{1}=I_{2}, \\
W_{3}=W_{4}, & L_{3}=L_{4}, & I_{3}=I_{4},
\end{array}
$$

- length \& gate overdrive voltage matched for current mirror pairs:

$$
\begin{array}{llll}
L_{5}=L_{8}, & L_{10}=L_{7}, & L_{3}=L_{9}, & L_{4}=L_{6} \\
V_{\mathrm{gov}, 5}=V_{\mathrm{gov}, 8}, & V_{\mathrm{gov}, 10}=V_{\mathrm{gov}, 7}, & V_{\mathrm{gov}, 3}=V_{\mathrm{gov}, 9}, & V_{\mathrm{gov}, 4}=V_{\mathrm{gov}, 6}
\end{array}
$$

- current relations:

$$
I_{1}=I_{3}=I_{5} / 2, \quad I_{8}=I_{\mathrm{ref}}, \quad I_{6}=I_{7}, \quad I_{9}=I_{10}
$$

## Saturation constraints

- diode connected devices $\left(M_{3}, M_{4}, M_{8}, M_{10}\right)$ automatically in saturation
- others must have $V_{\mathrm{ds}} \geq V_{\mathrm{dsat}}$ :
$-M_{7}: \quad V_{\mathrm{dsat}, 7} \leq V_{\mathrm{cm}}$
$-M_{6}: \quad V_{\mathrm{dsat}, 6}+V_{\mathrm{cm}} \leq V_{\mathrm{dd}}$
$-M_{9}: \quad V_{\mathrm{dsat}, 9}+V_{\mathrm{gs}, 10} \leq V_{\mathrm{dd}}$
$-M_{5}: \quad V_{\mathrm{ds}, 5}+V_{\mathrm{gs}, 1} \leq V_{\mathrm{cm}}$
$-M_{1} \& M_{2}: \quad V_{\mathrm{cm}}+V_{\mathrm{gs}, 3} \leq V_{\mathrm{dd}}+V_{\mathrm{th}}$
- . . . all are posynomial inequalities


## Node capacitances and non-dominant poles

- capacitances at nodes are posynomials, e.g.,

$$
C^{\mathrm{out}}=C_{\mathrm{gd}, 6}+C_{\mathrm{db}, 6}+C_{\mathrm{gd}, 7}+C_{\mathrm{db}, 7}+C_{\mathrm{L}}
$$

- non-dominant time constants are posynomials:

$$
\tau_{1}=\frac{C_{\mathrm{d} 1}}{g_{\mathrm{m}, 3}}, \quad \tau_{2}=\frac{C_{\mathrm{d} 2}}{g_{\mathrm{m}, 4}}, \quad \tau_{9}=\frac{C_{\mathrm{d} 9}}{g_{\mathrm{m}, 10}}
$$

$\left(C_{\mathrm{d} 1}, C_{\mathrm{d} 2}, C_{\mathrm{d} 9}\right.$ are node capacitances at drains of $\left.M_{1}, M_{2}, M_{9}\right)$

- to limit effect of non-dominant poles, make sum smaller than dominant time constant:

$$
\tau_{1}+\tau_{2}+\tau_{9} \leq \tau_{\mathrm{dom}}=C_{\mathrm{L}} / g_{\mathrm{m}}
$$

. . . a posynomial constraint

## Power versus bandwidth trade-off



## Joint electrical/physical design

- each device has a (physical) cell width $w$ and height $h$ for floor planning
- devices are folded into multiple fingers
- (approximate) posynomial or monomial relations link electrical variables ( $I, L, W$ ) and physical variables ( $w, h$ ), e.g.,
- cell area is at least $4 \times$ active area: $w h \geq 4 W L$
- cell aspect ratio limited to 5:1: $1 / 5 \leq w / h \leq 5$



## Slicing tree layout scheme

- vertical and horizontal slices fix relative placement of device cells
- leaves are device cells; root is bounding box



## Slicing tree constraints

- introduce width, height for each node in slicing tree
- for each vertical slice with parent $a$ and children $b, c$ add constraints

$$
w_{a}=w_{b}+w_{c}, \quad h_{a}=\max \left\{h_{b}, h_{c}\right\}
$$

- for each horizontal slice with parent $a$ and children $b, c$ add constraints

$$
w_{a}=\max \left\{w_{b}, w_{c}\right\}, \quad h_{a}=h_{b}+h_{c}
$$

- shows width and height of bounding box and each node is generalized posynomial of device cell widths, heights
- resulting GP formulation is very sparse


## Joint electrical/physical design via GP

- form one GP that includes
- electrical variables, constraints ( $I_{i}, L_{i}, W_{i}, g_{\mathrm{m}, i} \ldots$ )
- physical variables, constraints ( $w_{i}, h_{i}, w^{\text {bbox }}, h^{\text {bbox }}, \ldots$ )
- coupling constraints ( $w_{i} h_{i} \geq 4 W_{i} L_{i}, \ldots$ )
- solve it all together
- extensions: can add
- parasitic estimates
- more accurate expressions for device cell dimensions
- channels for routing


## Optimal filter implementation

simple Gm-C two-pole lowpass filter

transfer function is

$$
H(s)=\frac{1}{1+t_{1} s+t_{1} t_{2} s^{2}}, \quad t_{1}=C_{1} / g_{1}, \quad t_{2}=C_{2} / g_{2}
$$

$g_{i}$ is amplifier transconductance

## Noise analysis

- $N_{i}$ is input referred (white) amplifier input-referred voltage density
- spectral density of output noise is

$$
N(\omega)^{2}=\frac{N_{1}^{2}+\omega^{2} N_{2}^{2}}{\left(1-t_{1} t_{2} \omega^{2}\right)^{2}+t_{1}^{2} \omega^{2}}
$$

- root-mean-square output noise voltage is

$$
M=\left(\int_{0}^{\infty} N(\omega)^{2} d \omega\right)^{1 / 2}=\left(\alpha N_{1}^{2}+\beta N_{2}^{2}\right)^{1 / 2}
$$

## Amplifier and capacitor implementation models

- each amplifier has private variables $u$ (e.g., device lengths \& widths) and constraints
- transconductance $g$ is monomial in $u$; area $A^{\text {amp }}$, power $P$, input-referred noise density $N$ are posynomial in $u$
- each capacitor has private variables $v$ (e.g., physical dimensions) and constraints
- capacitance $C$ is monomial in $v$; area $A^{\text {cap }}$ is posynomial
- design variables are $u_{1}, u_{2}, v_{1}, v_{2}$


## Optimal filter implementation problem

- filter is Butterworth with frequency $\omega_{\mathrm{c}}$ :

$$
t_{1}=\sqrt{2} / \omega_{c}, \quad t_{2}=(1 / \sqrt{2}) / \omega_{c}
$$

- minimize total power of implementation, subject to area, output noise limits:

$$
\begin{array}{ll}
\operatorname{minimize} & P\left(u_{1}\right)+P\left(u_{2}\right) \\
\text { subject to } & t_{1}=\sqrt{2} / \omega_{c}, \quad t_{2}=(1 / \sqrt{2}) / \omega_{c} \\
& A^{\text {amp }}\left(u_{1}\right)+A^{\text {amp }}\left(u_{2}\right)+A^{\text {cap }}\left(v_{1}\right)+A^{\text {cap }}\left(v_{2}\right) \leq A^{\max } \\
& M=\left(\omega_{c} / 4 \sqrt{2}\right)\left(N_{1}^{2}+2 N_{2}^{2}\right)^{1 / 2} \leq M^{\max }
\end{array}
$$

- a GGP in the variables $u_{1}, u_{2}, v_{1}, v_{2}$


## Example

- Butterworth filter with $\omega_{c}=10^{8} \mathrm{rad} / \mathrm{s}$
- private variables in amplifiers: (equivalent) $L, W$
- amplifier model:

$$
\begin{aligned}
& A^{\mathrm{amp}}=W L, \quad P=2.5 \cdot 10^{-4} W / L \\
& g=4 \cdot 10^{-5} W / L, \quad N=\sqrt{7.5 \cdot 10^{-16} L / W}
\end{aligned}
$$

(based on simple model with $V_{\mathrm{dd}}=2.5, V_{\text {gov }}=0.2$ )

- private variable in capacitors is area $A^{\text {cap } ; ~} C=10^{-4} A^{\text {cap }}$
- $A^{\max }=4 \cdot 10^{-6}$


## Power versus noise trade-off



## Spiral inductor/differential resonator optimization



- loop inductor connected to (given) $R_{\mathrm{L}}$ and $C_{\mathrm{L}}$
- differential (floating) mode operation
- inductor designed to resonate at operating frequency $f$


## Design problem: differential resonator

```
maximize }\mp@subsup{R}{\textrm{T}}{
subject to }\mp@subsup{Q}{\textrm{T}}{}\geq\mp@subsup{Q}{\textrm{T}}{\mathrm{ min}},\quadA\leq\mp@subsup{A}{}{\mathrm{ max}
other constraints . . .
```

- objective \& specifications:
- $R_{\mathrm{T}}$ is tank impedance (which is real at operating frequency $f$ )
- $Q_{\mathrm{T}}$ is tank quality factor
- $A$ is area of loop inductor
- design variables: dimensions of loop inductor
- load resistance $R_{\mathrm{L}}$, load capacitance $C_{\mathrm{L}}$, frequency $f$ given
- we'll formulate as GP


## Loop inductor



- (centerline) diameter $D$
- width $W$
- outer diameter is $D+W$; area is $A=(D+W)^{2}$


## Lumped model for loop inductor



- lumped model for operation in differential mode
- impact of substrate capacitance, loss included in $R$ and $C$


## Example

- frequency range $2 \mathrm{GHz} \leq f \leq 6 \mathrm{GHz}$
- metal layer thickness $2 \mu \mathrm{~m}$, resistivity $5 \cdot 10^{-8} \Omega \mathrm{~m}$
- metal-substrate capacitance density $5 \cdot 10^{-6} \mathrm{Fm}^{2}$
- width, diameter constraints:

$$
150 \mu \mathrm{~m} \leq D \leq 600 \mu \mathrm{~m}, \quad 4 \mu \mathrm{~m} \leq W \leq 30 \mu \mathrm{~m}, \quad 10 \leq D / W \leq 100
$$

## GP models for $L, R$ and $C$

- can get exact values via EM simulation
- inductance (monomial)

$$
L=2.1 \cdot 10^{-6} D^{1.28} W^{-0.25} f^{-0.01}
$$

- resistance (posynomial)

$$
R=0.1 D W^{-1}+3 \cdot 10^{-6} D W^{-0.84} f^{0.5}+5 \cdot 10^{-9} D W^{-0.76} f^{0.75}+0.02 D W f
$$

- capacitance (posynomial):

$$
C=5 \cdot 10^{-6} D W+1 \cdot 10^{-11} D
$$

## Resonance constraint



- resonance condition: $4 \pi^{2} f^{2} L C_{\mathrm{T}}=1\left(C_{\mathrm{T}}=C+C_{\mathrm{L}}\right)$
- to handle in GP:
- impose posynomial constraint $C+C_{\mathrm{L}} \leq C_{\mathrm{T}}$
- add extra capacitance (after design) $C_{\text {extra }}=C_{\mathrm{T}}-C-C_{\mathrm{L}}$ if needed


## Tank conductance and quality factor



- tank conductance: $G_{\mathrm{T}}=\frac{1}{R_{\mathrm{T}}}=\frac{R}{4 \pi^{2} f^{2} L^{2}}+\frac{1}{R_{\mathrm{L}}} \quad$... posynomial
- inverse of tank quality factor: $\frac{1}{Q_{\mathrm{T}}}=\frac{R}{2 \pi f L}+\frac{2 \pi f L}{R_{\mathrm{L}}} \quad \ldots$ posynomial

Resonance impedance versus area trade-off


GOOZ


- loop inductor
- varactors for fine tuning
- binary weighted switching capacitors for coarse tuning
- cross coupled NMOS transistors
- tail current source


## LC oscillator design problem

```
minimize P
subject to }N\leq\mp@subsup{N}{}{\operatorname{max}},\quadA\leq\mp@subsup{A}{}{\operatorname{max}},\quadl\geq\mp@subsup{l}{}{\operatorname{min}
    other constraints . . .
```

- objective \& specifications:
- $P$ is power consumption
- $N$ is phase noise
- $A$ is area of loop inductor
$-l$ is loop gain
- given: load capacitance $C_{\mathrm{L}}$, center frequency $f$, normalized tuning range $T$
- we'll formulate as GP


## Design variables



- loop inductor dimensions $D, W$
- size of varactor $V_{\mathrm{c}}$
- size of switching capacitors $C_{\mathrm{sw}}$
- width, length of transistors $W_{\mathrm{nmos}}, L_{\mathrm{nm} \text { os }}$
- bias current $I_{\text {bias }}$


## Current source, switched capacitor, and varactor models

- $I_{\text {bias }}$ is bias current, with minimum operating voltage $V_{\text {bias }}$
- binary weighted capacitors
- $B$ is number of bits for switching capacitors
- $C_{\mathrm{sw}}$ is LSB switching capacitance; $2^{B-1} C_{\mathrm{sw}}$ is MSB switching capacitance
- varactor
- $C_{\mathrm{v}}$ is minimum varactor capacitance; $K_{\mathrm{v}} C_{\mathrm{v}}$ is maximum ( $K_{\mathrm{v}}$ is process constant)
- varactor range covers $2 \mathrm{LSB}: 2 C_{\mathrm{sw}} \leq 0.5\left(K_{\mathrm{v}}-1\right) C_{\mathrm{v}}$


## Tank capacitance

- tank capacitance is sum of $C_{\text {fix }}$ and $C_{\text {tune }}$
- fixed capacitance is sum of loop, load and transistor capacitances:

$$
C_{\mathrm{fix}}=C+0.5\left(C_{\mathrm{L}}+C_{\mathrm{gs}}+4 C_{\mathrm{gd}}+C_{\mathrm{db}}\right)
$$

- tunable capacitance is sum of switching and varactor capacitances:
- $C_{\text {tune }}$ for maximum frequency: $C_{\text {tune }}=0.5 C_{\mathrm{v}}$
- $C_{\text {tune }}$ for minimum frequency: $C_{\text {tune }}=2^{B} C_{\mathrm{sw}}+0.5 K_{\mathrm{v}} C_{\mathrm{v}}$


## Resonance frequency \& tuning

- capacitance constraint at maximum frequency: $C_{\text {fix }}+0.5 C_{\mathrm{v}} \leq C_{\mathrm{f}, \max }$
- maximum frequency: $(2 \pi f(1+T))^{2} L C_{\mathrm{f}, \max }=1$
- capacitance at center frequency: $(2 \pi f)^{2} L C_{\mathrm{f}, \mathrm{c}}=1$
- tuning range constraint:

$$
\frac{4 T}{\left(1-T^{2}\right)^{2}} C_{\mathrm{f}, \mathrm{c}} \leq C_{\mathrm{sw}}\left(2^{B}+2\right)
$$

## Power \& area

- power: $P=V_{\mathrm{dd}} I_{\mathrm{b} \text { ias }}$
- area: $A=(D+W)^{2}+2 W_{\mathrm{nmos}} L_{\mathrm{nmos}}$ (can add area of switched capacitors, varactor)


## Tank conductance \& voltage swing

- tank conductance is posynomial: $G_{\mathrm{T}}=\frac{R}{4 \pi^{2} f^{2} L^{2}}+0.5 g_{\mathrm{o}}$
- differential voltage amplitude: $V_{\mathrm{osc}}+2 V_{\mathrm{bias}} \leq 2 V_{\mathrm{dd}}, V_{\mathrm{osc}} G_{\mathrm{T}} \leq I_{\mathrm{bias}}$


## Phase noise

- thermal current noise power density of loop: $\overline{i_{\mathrm{n}, \mathrm{L}}^{2}}=\frac{4 k T R}{4 \pi^{2} f^{2} L^{2}}$
- thermal current noise power density of transistor: $\overline{\bar{i}_{\text {nmos }}^{2}}=4 k T \gamma g_{\mathrm{m}}$
- phase noise in the $1 / f^{2}$ region:

$$
N=\frac{1}{16 \pi^{2} f_{\mathrm{off}}^{2} C_{\mathrm{T}}^{2} V_{\mathrm{osc}}^{2}}\left(\overline{i_{\mathrm{n}, \mathrm{~L}}^{2}}+0.5 \overline{i_{\mathrm{nmos}}^{2}}\right)
$$

- . . . can add other noise terms


## Loop gain and start-up

- inverse of loop gain is posynomial: $1 / l=\left(g_{\mathrm{o}}+2 G\right) / g_{\mathrm{m}}$
- minimum loop gain to ensure start-up: $l \geq l^{\text {min }}$
- bias condition for quiescent operating point: $V_{\text {bias }}+V_{\mathrm{gs}}+\frac{I_{\mathrm{bias}} R}{4} \leq V_{\mathrm{dd}}$
- NMOS device models:

$$
\begin{aligned}
g_{\mathrm{m}} & =4.5 \cdot 10^{-3} W_{\text {nmos }}^{0.6} L_{\text {nmos }}^{-0.6} I_{\text {bias }}^{0.4} \\
g_{\mathrm{o}} & =2.6 \cdot 10^{-10} W_{\text {nmos }}^{0.4} L_{\text {nmos }}^{-1.4} I_{\text {bias }}^{0.6} \\
V_{\mathrm{gs}} & =0.34+1 \cdot 10^{-8} L_{\text {nmos }}^{-1}+5 \cdot 10^{2} W_{\text {nmos }}^{-0.7} L_{\text {nmos }}^{0.7} I_{\text {bias }}^{0.7}
\end{aligned}
$$

## LC oscillator example

- center frequency: $f=5 \mathrm{GHz}$
- tuning range: $T= \pm 10 \%$
- varactor tuning ratio: $K_{\mathrm{v}}=3$
- $B=3$ bits switching capacitor
- minimum loop gain: $l^{\text {min }}=2$
- load capacitance: $C_{\mathrm{L}}=200 \mathrm{fF}$
- supply voltage: $V_{\mathrm{dd}}=1.2 \mathrm{~V}$
- offset frequency for phase noise:
$f_{\text {off }}=600 \mathrm{kHz}$



## Power versus phase noise trade-off



## Monomial and Posynomial Fitting

## A basic property of posynomials

- if $f$ is a monomial, then $\log f\left(e^{y}\right)$ is affine (linear plus constant)
- if $f$ is a posynomial, then $\log f\left(e^{y}\right)$ is convex
- roughly speaking, a posynomial is convex when plotted on log-log plot
- midpoint rule for posynomial $f$ :
- let $z$ be elementwise geometric mean of $x, y$, i.e., $z_{i}=\sqrt{x_{i} y_{i}}$
- then $f(z) \leq \sqrt{f(x) f(y)}$
- a converse: if $\log \phi\left(e^{y}\right)$ is convex, then $\phi$ can be approximated as well as you like by a posynomial


## Convexity in circuit design context

- consider circuit with design variables $W_{1}, \ldots, W_{n}$ (say) \& performance measure $\phi\left(W_{1}, \ldots, W_{n}\right)$ (e.g., power, delay, area)
- two designs: $W_{i}^{(a)} \& W_{i}^{(b)}$, with performance $\phi^{(a)} \& \phi^{(b)}$
- form geometric mean compromise design with $W_{i}^{(c)}=\sqrt{W_{i}^{(a)} W_{i}^{(b)}}$, performance $\phi^{(c)}$
- if $\phi$ is generalized posynomial, then we have $\phi^{(c)} \leq \sqrt{\phi^{(a)} \phi^{(b)}}$
- this is not obvious


## Monomial/posynomial approximation: Theory

when can a function $f$ be approximated by a monomial or generalized posynomial?

- form function $F(y)=\log f\left(e^{y}\right)$
- $f$ can be approximated by a monomial if and only if $F$ is nearly affine (linear plus constant)
- $f$ can be approximated by a generalized posynomial if and only if $F$ is nearly convex


## Examples



- $\tanh (x)$ can be reasonably well fit by a monomial
- $0.5 /(1.5-x)$ can be fit by a generalized posynomial
- $(2 / \sqrt{\pi}) \int_{x}^{\infty} e^{-t^{2}} d t$ cannot be fit very well by a generalized posynomial


## What problems can be approximated by GGPs?

$$
\begin{array}{ll}
\operatorname{minimize} & f_{0}(x) \\
\text { subject to } & f_{i}(x) \leq 1, \quad i=1, \ldots, m \\
& g_{i}(x)=1, \quad i=1, \ldots, p
\end{array}
$$

- transformed objective and inequality constraint functions $F_{i}(y)=\log f_{i}\left(e^{y}\right)$ must be nearly convex
- transformed equality constraint functions $G_{i}(y)=\log G_{i}\left(e^{y}\right)$ must be nearly affine


## Monomial fitting via log-regression

find coefficient $c>0$ and exponents $a_{1}, \ldots, a_{n}$ of monomial $f$ so that

$$
f\left(x^{(i)}\right) \approx f^{(i)}, \quad i=1, \ldots, N
$$

- rewrite as

$$
\begin{aligned}
\log f\left(x^{(i)}\right) & =\log c+a_{1} \log x_{1}^{(i)}+\cdots+a_{n} \log x_{n}^{(i)} \\
& \approx \log f^{(i)}, \quad i=1, \ldots, N
\end{aligned}
$$

- use least-squares (regression) to find $\log c, a_{1}, \ldots, a_{n}$ that minimize

$$
\sum_{i=1}^{N}\left(\log c+a_{1} \log x_{1}^{(i)}+\cdots+a_{n} \log x_{n}^{(i)}-\log f^{(i)}\right)^{2}
$$

## Posynomial fitting via Gauss-Newton

find coefficients and exponents of posynomial $f$ so that

$$
f\left(x^{(i)}\right) \approx f^{(i)}, \quad i=1, \ldots, N
$$

- minimize sum of squared fractional errors

$$
\sum_{i=1}^{N}\left(\frac{f^{(i)}-f\left(x^{(i)}\right)}{f^{(i)}}\right)^{2}
$$

can be (locally) solved by Gauss-Newton method

- needs starting guess for coefficients, exponents


## Posynomial fitting example

- 1000 data points from $f(x)=e^{\left(\log x_{1}\right)^{2}+\left(\log x_{2}\right)^{2}}$ over $0.1 \leq x_{i} \leq 1$
- cumulative error distribution for 3 -, 5 -, and 7 -term posynomial fits



## A simple max-monomial fitting method

fit max-monomial

$$
f(x)=\max _{k=1, \ldots, K} f_{k}(x)
$$

$\left(f_{1}, \ldots, f_{k}\right.$ monomials) to data $x^{(i)}, f^{(i)}, i=1, \ldots, N$
simple algorithm:
repeat
for $k=1, \ldots, K$

1. find all data points $x^{(j)}$ for which $f_{k}\left(x^{(j)}\right)=f\left(x^{(j)}\right)$ (i.e., data points at which $f_{k}$ is the largest of the monomials)
2. update $f_{k}$ by carrying out monomial fit to these data

## Max-monomial fitting example

- same 1000 data points as previous example
- cumulative error distribution for 3 -, 5 -, and 7 -term max-monomial fits



## Conclusions

## Conclusions

(generalized) geometric programming

- comes up in a variety of circuit sizing contexts
- can be used to formulate a variety of problems
- admits fast, reliable solution of large-scale problems
- is good at concurrently balancing lots of coupled constraints and objectives
- is useful even when problem has discrete constraints


## Approach

- most problems don't come naturally in GP form; be prepared to reformulate and/or approximate
- GP modeling is not a "try my software" method; it requires thinking
- our approach:
- start with simple analytical models (RC, square-law, Pelgrom, ... ) to verify GP might apply
- then fit GP-compatible models to simulation or measured data
- for highest accuracy, revert to local method for final polishing
- looking for keys under street light (not where keys were lost, but lighting is good)
- forcing problems into GP-compatible form (problems aren't GPs, but solving is good)


## References

- A tutorial on geometric programming
- Digital circuit sizing via geometric programming
- Analog circuit design via geometric programming
- Convex optimization, Cambridge Univ. Press 2004
(these include hundreds of references)
available at www.stanford.edu/~ ${ }^{\text {boyd/research.html }}$


## Software

- MOSEK: www.mosek.com
- COPL-GP: (Yinyu Ye, in process of being re-worked): www.stanford.edu/~yyye/Col.html
- GPGLP: ftp://ftp.pitt.edu/dept/ie/GP/
- YALMIP: control.ee.ethz.ch/~joloef/yalmip.msql
- a simple matlab GP solver gp.m at Boyd's EE364 site

