# A Max Capacitance Constraining Approach for Power Reduction in Advanced Nodes

Chentouf Mohamed<sup>\*</sup>, Alaoui Ismaili Zine El Abidine<sup>\*\*</sup> <sup>\*</sup> Mentor Graphics Corporation/ICD Division- Rabat, Morocco <sup>\*\*</sup> ENSIAS/Telecommunications and Embedded Systems Team, University Mohammed V- Rabat, Morocco E-mail mohamed\_chentouf@mentor.com z.alaoui@um5s.net.ma

*Abstract*— The physical design of Very Large Scale Integrated (VLSI) circuits raises many challenges because of increasing design complexity, shrinking technology nodes and decreasing power budgets. As a result, traditional place and route (P&R) flows are often unable to meet timing and power requirements. To resolve these challenges, new P&R algorithms and flows need to be developed to get the best possible results.

In this paper, we evaluate the impact of varying the max capacitance constraint on power reduction capabilities, and we examine the quality of the traditional methodology for defining this constraint. Experimental results show that the power reduction gain may be improved by applying a new method of Max Capacitance Constraint (MCC) definition. The difference in power gain between the default and the new method reaches about 8.5% in our experiments.

## I. INTRODUCTION

According to Moore's law [1], integrated circuits (ICs) are continually shrinking, with an increase in operating frequency and a decrease in power supply voltage. To keep up with this trend, new developments and enhancements are made in design methodologies, production materials, processing technologies and EDA tools. Even with all the progress made, power dissipation remains a major obstacle and a differentiating factor of successful Application Specific Integrated Circuits (ASICs) [2]. To tackle this obstacle, many new power reduction techniques have been developed. Each one approaches the problem from a different angle: some act during placement, others during clock tree synthesis (CTS) or routing. The most used techniques in the backend flow are: sizing/spacing, buffer/inverter insertion, reordering of equivalent gate pins, logic remapping, re-routing of critical nets, use of non-default routing rules and use of High Voltage Threshold (HVT) cells.

The power dissipated in an Integrated Circuit (IC) can be divided into two main components: static and dynamic. Dynamic power dissipation is mainly due to the switching current from charging and discharging parasitic capacitances, and to the short-circuit current induced when both n-channel and p-channel transistors are momentarily on at the same time. Static power dissipation is due to leakage and subthreshold currents [3].

During physical design implementation, powerful Electronic Design Automation (EDA) tools help the designer through this critical phase of ASIC development. An example of such tools is Nitro SoC of Mentor Graphics. These tools are updated and enhanced continuously to best support the new fabrication rules provided by the foundries. Nitro SoC is a certified EDA tool that supports and honors all the design rules of technological nodes down to 7nm.

One of the important user inputs is the timing constraints file, which defines clock characteristics such as clock period and aspect-ratio, the input/output delays, max transition and max capacitance. The Nitro SoC optimizer uses these constraints, in addition to the timing library, to detect and fix timing and Electrical Design Rules (EDR) violations by applying optimization techniques.

In the past, designers were driven by the fastest implementation, especially in high-performance circuits (HPCs) which were associated with high-power dissipation. Nowadays, in addition to the circuit speed, the energy efficiency is a key factor to choose the right implementation [4].

To address this concern of power dissipation, many considerations are taken into account in all the stages of the design development cycle from specifications to mask generation. Examples of such techniques are Dynamic Voltage and Frequency Scaling (DVFS) [5], Parallel Architecture [6], Clock gating [7], and Power gating [8].

The remainder of this work is organized as follows. Section II presents some basic concepts of the physics of capacitance: it describes the sources of the parasitic capacitances seen in CMOS technologies and how they can be estimated. Section III provides a case study in which we show the benefits of choosing the right MCC value before power optimization. Section IV generalizes the study to a wide variety of designs with different topologies and technologies. Finally, Section V draws the conclusion.

## II. PARASITIC CAPACITANCE IN CMOS TECHNOLOGY

The main cause of power dissipation in advanced CMOS nodes is parasitic capacitance, which is a major constraint on circuit performance and can no longer be ignored. Parasitic capacitances come principally from two sources: interconnect parasitic capacitance [9] and transistor parasitic capacitance [3]. Many works have been devoted to modeling and estimating parasitic capacitance in order to accurately predict circuit delay and power.

Transistor parasitic capacitance, as shown in Figure 1, can be divided into the following components [3]:

- Junction capacitances: C<sub>BD</sub> and C<sub>BS</sub>.
- Overlap capacitances: C<sub>GSOV</sub> and C<sub>GDOV</sub>.
- Gate capacitances: C<sub>GS</sub>, C<sub>GD</sub> and C<sub>GB</sub>.



Figure 1. Different parasitic capacitances of a MOS transistor

The interconnect capacitance is also classified into three fundamental parts, as shown in Figure 2:

1) Plate capacitance: between two parallel metal surfaces [10].

2) Fringe capacitance: from the sidewall of the wire to another perpendicular surface, e.g., the ground plate [10].

3) Terminal capacitance: from the corner of the wire to other metal surfaces [10].



Figure 2. Different parasitic capacitances seen in circuit interconnect [10]

These parasitic capacitances are the major contributor to dynamic power dissipation, which is caused by the switching activity of the circuit. A higher operating frequency leads to more frequent switching in the circuit and therefore to higher power dissipation. As demonstrated in [11], the dynamic power due to the switching current of a CMOS gate ($P_{sw}$) can be estimated by equation 1:

(1)  $P_{sw} = S_w f C_L V^2$ 

Where:

- $S_w$ is the switching activity of the input,
- $f$ is the frequency of operation,
- $C_L$ is the load parasitic capacitance,
- $V$ is the voltage swing across the capacitor.

From equation 1, for a given operating frequency and voltage swing, the power dissipation of a gate can be reduced either by reducing the switching activity or by reducing the parasitic capacitance.
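As a rough numerical illustration of equation 1, the sketch below evaluates $P_{sw}$ for a single gate and shows that the switching power scales linearly with the load capacitance. The operating point (activity, frequency, capacitance, voltage) is an arbitrary assumption chosen for illustration and is not taken from our experiments.

```python
def switching_power(s_w: float, f_hz: float, c_load_f: float, v_swing: float) -> float:
    """Dynamic switching power of a gate per equation 1: P_sw = S_w * f * C_L * V^2."""
    return s_w * f_hz * c_load_f * v_swing ** 2


# Assumed operating point: 20% switching activity, 500 MHz, 10 fF load, 1.0 V swing.
p_ref = switching_power(0.2, 500e6, 10e-15, 1.0)

# Same gate with the load capacitance reduced by 25% (all other terms unchanged).
p_low = switching_power(0.2, 500e6, 7.5e-15, 1.0)

print(f"reference load: {p_ref * 1e6:.3f} uW")
print(f"reduced load:   {p_low * 1e6:.3f} uW")
print(f"saving:         {100 * (1 - p_low / p_ref):.1f} %")
```

Because $C_L$ enters equation 1 linearly, any relative reduction of the load capacitance translates into the same relative reduction of the switching power, as long as the other three terms stay fixed.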

The best-known circuit-level techniques to reduce dynamic power are:

- Gate sizing ([12], [13]): big cells that lie on subcritical paths and have large parasitic capacitance are replaced by the smallest gates with identical logic function that still satisfy the delay requirement. This technique is widely used in industry for timing, area [14] and power [12] optimization.
- Equivalent pin reordering: most combinational gates found in a cell library have input pins that are logically equivalent (e.g., ANDs, ORs, XORs), but logically equivalent pins may not have identical circuit characteristics, so they can differ in delay or power consumption. The technique connects the input pin with high capacitance to the net with low switching activity. This property can be exploited for low power design [15].
- Net re-routing: nets that have large parasitic capacitance are re-routed through less congested areas to reduce the capacitance caused by neighboring wires.
- Use of HVT cells: by using such cells, the amount of charge stored in the parasitic capacitances of the transistors is reduced, and hence the dissipated power.
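To make the equivalent pin reordering idea concrete, the following is a minimal sketch, not the algorithm of [15] or of any P&R tool. It assumes that each logically equivalent input pin has a known capacitance and each net a known switching activity, and it pairs the most active net with the lowest-capacitance pin. The gate, pin names, net names and numeric values are hypothetical.

```python
def reorder_pins(pin_caps: dict[str, float], net_activity: dict[str, float]) -> dict[str, str]:
    """Assign nets to logically equivalent pins so that sum(activity * capacitance) is minimal.

    Pairing activities in descending order with capacitances in ascending order
    minimizes the weighted sum (rearrangement inequality).
    """
    if len(pin_caps) != len(net_activity):
        raise ValueError("need exactly one net per pin")
    pins_by_cap = sorted(pin_caps, key=pin_caps.get)                          # ascending capacitance
    nets_by_act = sorted(net_activity, key=net_activity.get, reverse=True)    # descending activity
    return dict(zip(nets_by_act, pins_by_cap))


def switched_cap(assignment: dict[str, str], pin_caps: dict[str, float],
                 net_activity: dict[str, float]) -> float:
    """Activity-weighted input capacitance of a net-to-pin assignment."""
    return sum(net_activity[net] * pin_caps[pin] for net, pin in assignment.items())


# Hypothetical 3-input gate: pin capacitances in fF, net switching activities in [0, 1].
pin_caps = {"A": 1.2, "B": 1.0, "C": 1.5}
net_activity = {"n1": 0.05, "n2": 0.40, "n3": 0.20}

initial = {"n1": "B", "n2": "C", "n3": "A"}      # arbitrary starting connection
best = reorder_pins(pin_caps, net_activity)

print("initial:", round(switched_cap(initial, pin_caps, net_activity), 3))
print("reordered:", best, round(switched_cap(best, pin_caps, net_activity), 3))
```

The sketch only looks at input capacitance. A real optimizer would also have to check that the reordered pins still meet the delay requirement, since equivalent pins can differ in delay as well.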

## III. MAX CAPACITANCE VARIATION IMPACT ON POWER OPTIMIZATION (CASE STUDY)

In advanced technology designs with very high density and more than 10 metal layers, parasitic capacitance becomes a limiting factor for speed and power consumption. Researchers working on power reduction during physical design have focused most of their effort on finding and improving techniques at the circuit level (gate level) by adopting a bottom-up methodology ([5], [6], and [7]): they prove the effectiveness of a technique on a small circuit (a few gates) and then generalize it. In our study, we explore the constraining dimension, which is a user input that drives physical design tools, and we examine its impact on the tool's power reduction capabilities to see whether power can be reduced by applying the right constraint (max capacitance in our case).

The max capacitance constraint of a cell is the maximum load that the cell can drive. Its default value is specified in the library file and is calculated during the cell characterization phase using the cell's SPICE models. Users can also impose a new max capacitance constraint. In this case, the P&R tool uses the most pessimistic value between the user-defined and library values.
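The selection rule can be written in a few lines. The sketch below interprets "most pessimistic" as the smaller of the two limits, since a smaller maximum is the stricter constraint. This interpretation and the function name are our assumptions for illustration, not the tool's actual interface.

```python
from typing import Optional


def effective_mcc(library_mcc: float, user_mcc: Optional[float] = None) -> float:
    """Return the max capacitance limit enforced on a cell output.

    Assumption: the most pessimistic value is the smaller limit.
    """
    if user_mcc is None:
        return library_mcc          # no user constraint: fall back to the library value
    return min(library_mcc, user_mcc)


# Hypothetical limits in fF.
print(effective_mcc(50.0))          # library value only
print(effective_mcc(50.0, 40.0))    # tighter user constraint wins
print(effective_mcc(50.0, 60.0))    # looser user constraint is ignored under this rule
```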

The output of a cell cannot be connected to a load whose parasitic capacitance is bigger than the maximum capacitance defined in the library. The optimizer uses these constraint values to detect violations and fix them; it also uses them to cost its solutions when fixing timing or power. A solution that causes a max capacitance or a max transition violation is rejected. So if the max capacitance value is too tight, many solutions will be rejected because they would cause DRC violations. In the same way, if the max capacitance is too relaxed, the optimizer will accept solutions that degrade capacitance, and power will be impacted. The max capacitance value should therefore be chosen carefully before running any optimization in order to achieve the best power reduction results.
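The effect of a tight constraint on the optimizer's search space can be illustrated with a small filter. This is a simplified sketch under our own assumptions: it models only the max capacitance check (not max transition), and the candidate moves, their loads and their power deltas are hypothetical values, not the internal cost function of any tool.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    load_cap: float      # load seen by the driving cell after the change (fF)
    power_delta: float   # change in power; negative means power is reduced


def accepted(candidates, mcc):
    """Keep only the candidates that do not create a max capacitance violation."""
    return [c for c in candidates if c.load_cap <= mcc]


def best_power_move(candidates, mcc):
    """Pick the accepted candidate with the largest power reduction, or None."""
    return min(accepted(candidates, mcc), key=lambda c: c.power_delta, default=None)


# Hypothetical candidate moves for one driver.
moves = [
    Candidate("downsize_driver", load_cap=40.0, power_delta=-3.0),
    Candidate("swap_to_low_power_cell", load_cap=55.0, power_delta=-5.0),
]

for mcc in (30.0, 50.0, 60.0):
    move = best_power_move(moves, mcc)
    print(mcc, move.name if move else "no legal move")
```

With the tightest limit, every move is rejected; with the intermediate limit, only the smaller gain is reachable; with the loosest limit, the larger gain becomes legal. The opposite risk described above, where a relaxed limit lets through moves that increase capacitance, is not modeled here.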

Our first motivational design, TestCase1 (Figure 3), has the following characteristics:

- Technology: 180nm
- Number of instances: 2.5 Million
- Number of Macros: 182
- Number of Modes: 02
- Number of Corners: 12
- Area (sq-micro (e-12)): 8.74872e+07
- Utilization: 51.65%
- Max clock frequency: 500 MHz
- Number of clocks: 67
- Number of layers: 9
- Layers Resistance (kΩ): [8.36614e-05 1.93917e-05]
- Layers Capacitance (ff): [2.8002e-05 4.2146e-05]
- Design Stage: PreCTS

We developed a flow (Flow 1) that varies the Max Capacitance Constraint (MCC) and measures the power improvement after optimization. First, we set the range of MCC values to explore between 0 and MCC high (MCC<sub>H</sub>). In our example, we chose MCC<sub>H</sub> to be 5 times the default MCC defined in the library (MCC<sub>d</sub>). For each value, we load the design database, which consists of the netlist, the timing and technology library files, and the timing constraints. We then enable power in all the design's corners and apply the MCC on the design. We run one pass of power optimization, and finally we measure the power reduction for the specific MCC applied.



Figure 3. Testcase1: Schematic and Layout Views in Nitro SoC P&R

**Flow 1**: Measure MCC variation Impact On Power Reduction capabilities.

1. For Cap $\in \{0, \dots, 5 \times MCC_d\}$ do
2. Read design database
3. Enable power in all corners
4. Set MCC to Cap value
5. Measure power (initial value)
6. Optimize power
7. Measure power (final value)
8. End for
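The structure of Flow 1 can be sketched as a simple sweep driver. The tool commands themselves are not shown in this paper, so steps 2 to 7 are hidden behind a hypothetical callback `run_pass`; the function names, the 0.1 de-rate step and the toy power model used to exercise the loop are all assumptions for illustration and do not reflect Nitro SoC behavior or our measured data.

```python
from typing import Callable, Iterable


def sweep_mcc(
    derates: Iterable[float],
    mcc_default: float,
    run_pass: Callable[[float], tuple[float, float]],
) -> list[dict]:
    """Run one independent power-optimization pass per MCC value (Flow 1).

    run_pass(mcc) is a hypothetical stand-in for steps 2-7: reload the design
    database, enable power in all corners, apply the MCC, optimize, and return
    (initial_power, final_power).
    """
    results = []
    for dr in derates:
        mcc = dr * mcc_default
        p_init, p_final = run_pass(mcc)
        results.append({"derate": dr, "mcc": mcc, "initial": p_init, "final": p_final})
    return results


def best_derate(results: list[dict]) -> float:
    """De-rate factor that gave the lowest final power in the sweep."""
    return min(results, key=lambda r: r["final"])["derate"]


def toy_pass(mcc: float) -> tuple[float, float]:
    # Placeholder model only: a bowl-shaped curve with an arbitrary minimum,
    # so that the sweep has something to find.
    initial = 100.0
    final = initial - 5.0 + 4.0 * (mcc - 1.5) ** 2
    return initial, final


derates = [i / 10 for i in range(0, 51)]     # 0.0, 0.1, ..., 5.0 (assumed step)
results = sweep_mcc(derates, mcc_default=1.0, run_pass=toy_pass)
print("best de-rate in the toy model:", best_derate(results))
```

Reloading the database inside each pass, as Flow 1 does, keeps the runs independent: every MCC value starts from the same initial design state, so the final power values are directly comparable.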

We applied Flow 1 on TestCase1 and measured the power improvement for each MCC value. The graph in Figure 4 summarizes the results. The x axis represents the de-rate factor (DR), which is the value multiplied by the default $MCC_d$ (MCC = DR \* $MCC_d$); the y1 axis represents the power value after optimization, and the y2 axis represents the impact on DRC. From the graph, we can see that the amount of power reduction achieved after optimization depends on the specified MCC, and that it is optimal when DR is equal to 1.2. This can be explained by the fact that most circuit optimization techniques work on cells (swapping, upsizing, downsizing ...): by giving the right constraints, we drive the optimizer to optimize more targets and hence to achieve better results.



Figure 4. Power reduction as a function of the MCC de-rate applied on TestCase1

## IV. EXPERIMENTAL RESULTS

The motivational example presented in Section III gives evidence that the default MCC value is not optimal for power optimization, and shows the existence of another MCC range where power optimization gives the best power reduction.

We applied Flow 1 using the Mentor Graphics Nitro SoC P&R tool on a wide variety of designs with different sizes and technological nodes. For each design, we report the power value before optimization, the power and power reduction gain obtained with the default MCC, and the power and power reduction gain obtained with the de-rated MCC (DR = 1.1 in our exercise). The difference in power gain reaches 8.5% in some cases (see Table 1 and Figure 5), which is a very encouraging gain, especially in such a competitive domain where even a 1% gain represents a differentiating factor.

| Design | Power Default | Power after opt. (DR=1) | Power Gain (%) (DR=1) | DRC viol (DR=1) | Power after opt. (DR=1.1) | Power Gain (%) (DR=1.1) | DRC viol (DR=1.1) | Gain Difference (%) |
|------------|---------------|-------------------|------------|-------------|-----------------|------------|-------------|-----------------|
| TestCase1  | 460.90        | 524.14            | -6.42      | -8932131.00 | 441.92          | 2.10       | -8969173.00 | 8.52            |
| TestCase2  | 405.82        | 444.03            | -4.50      | -8977922.00 | 390.88          | 1.88       | -9013279.00 | 6.37            |
| TestCase3  | 2282.52       | 2175.62           | 2.40       | -185710.00  | 2130.61         | 3.44       | -447937.00  | 1.04            |
| TestCase4  | 1003.64       | 929.45            | 3.84       | -13894.00   | 920.90          | 4.30       | -13687.00   | 0.46            |
| TestCase5  | 1122.31       | 1012.74           | 5.13       | 0.00        | 1005.89         | 5.47       | 0.00        | 0.34            |
| TestCase6  | 3146.87       | 2846.71           | 5.01       | -92885.00   | 2830.39         | 5.29       | -88266.00   | 0.29            |
| TestCase7  | 3735.20       | 3525.07           | 2.89       | -1834868.00 | 3505.00         | 3.18       | -1835921.00 | 0.29            |
| TestCase8  | 3843.85       | 3487.18           | 4.87       | -27934.00   | 3469.58         | 5.12       | -37140.00   | 0.25            |
| TestCase9  | 2079.20       | 2005.17           | 1.81       | -90027.00   | 1997.13         | 2.01       | -87907.00   | 0.20            |
| TestCase10 | 622.49        | 625.69            | -0.26      | -19955.00   | 623.48          | -0.08      | -29246.00   | 0.18            |
| TestCase11 | 3463.78       | 3230.13           | 3.49       | 0.00        | 3219.27         | 3.66       | 0.00        | 0.17            |
| TestCase12 | 1726.45       | 1799.91           | -2.08      | -104322.00  | 1795.85         | -1.97      | -105085.00  | 0.11            |
| TestCase13 | 8207.96       | 7549.45           | 4.18       | -1292659.00 | 7532.88         | 4.29       | -2005949.00 | 0.11            |
| TestCase14 | 2416.87       | 2316.74           | 2.12       | -3342.00    | 2311.82         | 2.22       | -3032.00    | 0.11            |
| TestCase15 | 2435.34       | 2282.53           | 3.24       | -7386.00    | 2277.90         | 3.34       | -7550.00    | 0.10            |
| TestCase16 | 1323.71       | 1168.00           | 6.25       | -658.00     | 1165.67         | 6.35       | -609.00     | 0.10            |
| TestCase17 | 4657.12       | 4117.56           | 6.15       | -85053.00   | 4111.06         | 6.23       | -83710.00   | 0.08            |
| TestCase18 | 886.61        | 875.04            | 0.66       | -95964.00   | 874.10          | 0.71       | -127943.00  | 0.05            |
| TestCase19 | 4573.36       | 4366.94           | 2.31       | -14629.00   | 4362.41         | 2.36       | -14933.00   | 0.05            |

TABLE 1: POWER DISSIPATION COMPARISON BETWEEN DEFAULT MCC AND DE-RATED MCC (DR=1.1) ON SEVERAL DESIGNS
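The gain columns of the table can be recomputed from the power columns. The text does not state the gain formula explicitly; the tabulated values are consistent with a symmetric relative difference, $(P_{before} - P_{after}) / (P_{before} + P_{after})$ expressed in percent, and the sketch below uses that formula as an inference from the numbers rather than as a stated definition. The three rows are copied from the table.

```python
def power_gain(p_before: float, p_after: float) -> float:
    """Power gain in percent, using the symmetric relative difference inferred from the table."""
    return 100.0 * (p_before - p_after) / (p_before + p_after)


# (power before optimization, power after opt. with DR=1, power after opt. with DR=1.1)
rows = {
    "TestCase1": (460.90, 524.14, 441.92),
    "TestCase3": (2282.52, 2175.62, 2130.61),
    "TestCase5": (1122.31, 1012.74, 1005.89),
}

for name, (p0, p_default, p_new) in rows.items():
    g_default = power_gain(p0, p_default)
    g_new = power_gain(p0, p_new)
    print(f"{name}: default {g_default:.2f}, new {g_new:.2f}, difference {g_new - g_default:.2f}")
```

Under this formula, a negative gain means that the optimization pass increased the power relative to the starting point, and the last column is simply the gain with DR = 1.1 minus the gain with DR = 1.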



Figure 5. Power reduction gain comparison between default and new MCC

## V. CONCLUSION

In this paper, we evaluated the impact of the max capacitance constraint on power optimization in the physical design phase of ASICs. We showed that, for the same design and with the same optimization techniques, the power reduction can be improved if the design is constrained with a well-chosen MCC value. We also showed that by adopting this method of max capacitance evaluation, the gain in power reduction may reach ~8.5%.

The flow presented in Section III was applied on more than 100 designs, and the results confirmed that careful attention should be paid when constraining a design in order to drive physical design tools such as Nitro SoC of Mentor Graphics and get the best power reduction.

## REFERENCES

- [1] Hiroshi Iwai, "End of the scaling theory and Moore's law", Junction Technology (IWJT), 2016 16th International Workshop
- [2] A. Chin; S. R. McAlister, "The power of functional scaling: beyond the power consumption challenge and the scaling roadmap", IEEE Circuits and Devices Magazine, Year: 2005, Volume: 21, Issue: 1, Pages: 27 - 35
- [3] Michael John Sebastian Smith, "Application-Specific Integrated Circuits", 9th printing, 2001, page 816
- [4] M. Horowitz, W. Dally, "How scaling will change processor architecture", ISSCC Tech. Dig. 2004, pp. 132-133.
- [5] Dongsheng Ma; Rajdeep Bondade, "Enabling Power-Efficient DVFS Operations on Silicon", IEEE Circuits and Systems Magazine (Volume: 10, Issue: 1, First Quarter 2010), Pages: 14 – 30.
- [6] Hyun Suk Choi; Jong Hyun Choi; Jong Tae Kim, "Low-Power AES Design Using Parallel Architecture", Convergence and Hybrid Information Technology, 2008. ICHIT '08. International Conference.
- [7] Vazgen Melikyan; Eduard Babayan; Anush Melikyan; Davit Babayan; Poghos Petrosyan; Edvard Mkrtchyan, "Clock gating and multi-VTH low power design methods based on 32/28 nm ORCA processor", East-West Design & Test Symposium (EWDTS), 2015 IEEE.
- [8] Byoung-Kwan Jeon; Seong-Kwan Hong; Oh-Kyong Kwon, "A low-power 10-bit single-slope ADC using power gating and multiclocks for CMOS image sensors", SoC Design Conference (ISOCC), 2016 International
- [9] Weibing Gong; Wenjian Yu; Yongqiang Lü; Qiming Tang; Qiang Zhou; Yici Cai, "A parasitic extraction method of VLSI interconnects for pre-route timing analysis", 2010 International Conference on Communications, Circuits and Systems (ICCCAS), Year: 2010, Pages: 871 – 875
- [10] Yu Cao, "Predictive Technology Model for Robust Nanoelectronic Design" Library of Congress Control Number: 2011931530
- [11] Gary K. Yeap, "Practical low power digital VLSI design", 1998, Pages 175-178.
- [12] Azam Beg, "Automating the CMOS Gate Sizing for Reduced Power/Energy", Frontiers of Information Technology (FIT), 2014 12th International Conference.
- [13] Vinicius dos S. Livramento; Chrystian Guth; Jose Luis Guntzel; Marcelo O. Johann, "Evaluating the impact of slew on delay and power of neighboring gates in discrete gate sizing," in 2012 IEEE 3rd Latin American Symposium on Circuits and Systems (LASCAS).
- [14] Gracieli Posser; Guilherme Flach; Gustavo Wilke; Ricardo Reis, "Gate Sizing Minimizing Delay and Area", VLSI (ISVLSI), 2011 IEEE Computer Society Annual Symposium.
- [15] Chih-Wei Chang; Ming-Fu Hsiao; Bo Hu; Kai Wang; M. Marek-Sadowska; Chung-Kuan Cheng; Sao-Jie Chen, "Fast postplacement optimization using functional symmetries", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (Volume: 23, Issue: 1, Jan 2004), Pages: 102 – 118.