IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

# An On-Chip NBTI Sensor for Measuring pMOS Threshold Voltage Degradation

John Keane, Student Member, IEEE, Tae-Hyoung Kim, Student Member, IEEE, and Chris H. Kim, Member, IEEE

Abstract-Negative bias temperature instability (NBTI) is one of the most critical device reliability issues in sub-130 nm CMOS processes. In order to better understand the characteristics of this mechanism, accurate and efficient means of measuring its effects must be explored. In this work, we describe an on-chip NBTI degradation sensor using a delay-locked loop (DLL), in which the increase in pMOS threshold voltage due to NBTI stress is translated into a control voltage shift in the DLL for high sensing gain. The proposed sensor is capable of supporting both DC and AC stress modes. Measurements from a test chip fabricated in a 130 nm bulk CMOS process show an average gain of  $10 \times$  in the operating range of interest, with measurement times in tens of microseconds possible for minimal unwanted threshold voltage recovery. NBTI degradation readings across a range of operating conditions are presented to demonstrate the flexibility of this system.

*Index Terms*—Delay locked loop (DLL), negative bias temperature instability (NBTI), reliability.

## I. INTRODUCTION

DESIGNING reliable circuits has become increasingly complex in aggressively scaled CMOS technologies. Several reliability issues that have been recognized for some time are now more problematic as oxide thicknesses are scaled towards 1 nm, voltage margins are reduced, and more devices are placed on a chip. One complexity that has recently attracted a great deal of attention is negative bias temperature instability (NBTI) [1]–[25]. As the oxide thicknesses of pMOS transistors are reduced and operating temperatures tend to increase, the shift in the threshold voltage caused by NBTI can become a dominant limiting factor in device lifetimes. This issue is particularly acute in processes incorporating nitrogen into the gate oxides in order to reduce gate leakage and boron penetration, as increased nitrogen content has been shown to accelerate NBTI degradation [18], [21]. A growing body of research is devoted to further understanding this mechanism in order to equip circuit designers with the knowledge and tools they need to create robust systems in CMOS processes experiencing severe NBTI degradation.

NBTI is characterized by a positive shift in the absolute value of the pMOS threshold voltage ( $|V_{tp}|$ ), which occurs when a device is biased in strong inversion. This threshold shift is generally attributed to the breaking of Si-H bonds at the oxide interface by holes in the pMOS inversion layer. The bond breaking process creates positively charged interface traps which, in combination with new or existing traps within the oxide, lead to the increase in  $|V_{tp}|$  [see Figs. 1(a) and 12] [1], [2], [12], [15]. When a device is turned off, it enters the "recovery" or "passivation" phase, where the freed hydrogen species diffuse back towards the oxide/silicon interface and anneal the broken Si-H bonds, thereby reducing  $|V_{tp}|$  [see Figs. 1(b) and 14] [3], [5], [6]. Some authors have claimed that the hole detrapping effect in the oxide bulk is the primary origin of this recovery [1], [2], [17]. In any case, the reduction in  $|V_{tp}|$  observed when stress conditions are removed leads to a significantly longer device lifetime than would be predicted by DC stress experiments [3]-[5]. While NBTI recovery results in a longer time to failure, it also complicates the process of characterizing this mechanism, as stress conditions must be periodically removed throughout the stress experiments to extract the device parameter(s) of interest in most measurement schemes. Therefore, unwanted recovery occurs during measurement periods, which leads to overly optimistic degradation estimates.

The authors are with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: jkeane@ece.umn.edu).

Digital Object Identifier 10.1109/TVLSI.2009.2017751

Fig. 1. Cross section of a pMOS device under (a) NBTI stress and (b) in recovery mode.

NBTI leads to a host of problems in circuit performance over time. For example, as a CMOS system ages, certain logic paths that were not critical at design time may experience more significant NBTI stress, thereby becoming critical, and preventing proper timing closure [20]. This mechanism also causes degradation in the static noise margin of SRAM cells [13], and lowers the maximum operating frequency of aged circuits [6], [7], [20]. In order for circuit designers to mitigate these effects without



Manuscript received March 28, 2008; revised July 07, 2008. This work was supported in part by Intel and IBM.

using costly over-design methods, such as liberally up-sizing stressed devices or using large guard bands in the system clock, accurate predictive models must be developed and incorporated into their suite of design tools. Recent efforts have been devoted to developing such models [3], [6], [7], [15], [20] which must be solidly corroborated by reliable hardware measurements in order to be effective.

Much of the work measuring NBTI degradation in hardware has involved highly invasive measurements, or techniques requiring specialized equipment. In this paper, we circumvent those issues with our proposed on-chip sensor that directly translates the pMOS threshold voltage degradation caused by NBTI into a shift in the control voltage of a delay-locked loop (DLL), which can be readily monitored with standard off-chip equipment. Our design is capable of measuring the effects of both DC and AC stress (although in the hardware implementation we focused on DC stress measurements), and taking measurements within tens of microseconds in order to avoid unwanted device recovery. This circuit enables us to investigate NBTI under a variety of stress conditions, its frequency dependency, and threshold voltage recovery behavior when stress is removed, which are all currently topics of interest in the device reliability community.

#### **II. PREVIOUS NBTI MEASUREMENT TECHNIQUES**

As stated above, many previously proposed NBTI measurement techniques involve highly invasive experimental setups that require specialized equipment and/or access to individual transistors under test. In this section, we will summarize a number of these and other less invasive methods, along with their benefits and drawbacks.

Chen *et al.* were the first to report the partial recovery of pMOS transistor strength when stress conditions are removed, which results in longer device lifetimes [5]. Those authors used an improved direct-current current-voltage (DCIV) measurement technique to monitor the formation of interface traps  $(N_{\rm it})$ . This method allowed them to observe a correlation between  $\Delta V_{\rm tp}$  and  $\Delta N_{\rm it}$ , but required the sensitive monitoring of base and collector currents in a gate-controlled parasitic BJT of a MOSFET.

Denais *et al.* proposed an on-the-fly measurement technique in order to avoid the recovery inherent in most measurement setups [8]. In this method, the stress voltage is kept quasi-constant, and the linear drain current  $(I_{D,lin})$  of the device under test (DUT) is periodically measured to monitor device degradation. In [9], the on-the-fly technique was extended to characterize the recovery after stress conditions are removed. However, the on-the-fly method relies on a translation of  $\Delta I_{D,lin}$  into  $\Delta V_{tp}$, which requires some approximations, and the authors of [17] state that this method underestimates the total degradation due to a slow initial measurement which causes unrecorded degradation as well. Additionally, the time required for each measurement is typically in the range of milliseconds, and it is difficult to get an accurate reading of  $\Delta I_{D,lin}$  at the stress voltage level, all of which could make on-the-fly results less reliable [12].

Aota *et al.* proposed a measurement method to minimize unwanted device recovery during the measurement period by using fully-automatic wafer probing [10]. The uncontrollable relaxation time in this technique was still 5 ms, though. Subsequent research has shown that NBTI recovery can become significant in 1 ms or less after the removal of stress conditions [14], [17], [24], necessitating measurement techniques with even shorter reaction times. More recently, Fernández *et al.* proposed on-chip circuits for the reliable measurement of device degradation due to AC NBTI stress up to the gigahertz range [11]. In this work, I-V curves of single transistors under test, as well as voltage transfer curves of inverters placed under stress, were used to extract  $\Delta V_{\rm tp}$ . No mention is made of the time required for each reading, and such measurements are not typically made in the timescale needed to avoid unwanted NBTI recovery. The authors also use on-the-fly techniques which we covered earlier in this section.

In [16], Kim *et al.* introduced an aging monitor which is capable of taking fast and precise frequency degradation measurements by detecting the beat frequency of a pair of simple inverter-based ring oscillators (ROSCs), where one is placed under accelerated stress conditions by raising the supply voltage. However, in this circuit, only half of the pMOS devices in the ROSC are stressed at one time, and frequency degradation is the measured parameter. That frequency change cannot easily be mapped to a threshold degradation due solely to NBTI, since simply raising the supply will cause a combination of accelerated NBTI, positive bias temperature instability in nMOS devices, hot-carrier stress, and possibly other degradation mechanisms to impact circuit performance simultaneously. Ketchen *et al.* also used the beat frequency idea with a new ROSC stage designed specifically to facilitate "pure" NBTI stress ( $V_s = V_d$ ;  $V_g \leq 0$  V) [19]. The authors of that paper did not mention any attempt to execute fast measurements to avoid recovery, and it may be difficult to know the precise stress  $V_{gs}$  in their setup if the gate current in the DUTs is comparable to other leakage sources in the ROSC. The authors rely on the fact that the DUTs' drain and source terminals are grounded through all of these other leakage paths when VCC = GND = 0 V, which may not be accurate, particularly with a large stress voltage. Next, Shen *et al.* used a 100 ns I-V sweep technique to monitor NBTI degradation, and demonstrated the fundamental differences in NBTI that are observed with ultra-high speed measurements [17]. This technique is, however, subject to the drawbacks associated with high speed off-chip device probing.

Finally, Karl *et al.* proposed a setup to monitor the frequency of a ROSC with a pMOS header that is stressed and then biased in subthreshold during measurement periods for high  $V_{tp}$  sensitivity [22]. This work relies on a complex mathematical model to map temperature and threshold voltage variation to the ROSC frequency after extensive calibration. Also, the authors cite a  $\sim 450 \times$  area savings versus that quoted in our previous publication [23] without including the area overhead of their full design, while our number included an optional replica DLL for constant bias generation, additional circuits built in for maximum testing flexibility, and decoupling capacitance. Note that if desired, our area could be reduced by cutting down the length of our VCDL, lowering the amount of flexibility we built in via variables like adjustable delay line lengths, and reducing the size of our loop filter. The latter structure was liberally sized up to lower the



Fig. 2. (a) Block diagram of the proposed NBTI degradation sensor. (b) Loop bandwidth calculation and comparison to the reference frequency.

loop bandwidth (see Section III-A), but our system would remain within the guideline for stability with a  $\sim 75 \times$  reduction in loop filter capacitance. This would lead to an area difference of only  $\sim 62 \times$  when considering one of our complete sensors without further area optimization, compared to the portion of Karl's design used in their comparison. However, area is not the critical design constraint in a test chip meant to provide valuable process characterization data. Karl's design is claimed to be an "in-situ" sensor, but may be difficult to include in actual products due to the calibration steps needed, and the range of biases required.

Our proposed sensor addresses many of the issues raised here, including the need for fast measurements, avoiding high speed I/O signals or device probing, and isolating NBTI stress in the DUTs since only the gate voltage of the DUTs is raised to the stress level, and all other voltage drops in the DLL delay stages containing the DUTs are limited to less than $|\text{VCC}|$. Compared with other recently proposed NBTI sensors using circuit methods [16], [22], this scheme has the advantage of providing a direct measure of the pMOS threshold degradation, with no need to make assumptions about other degradation mechanisms or to rely on modeling equations and simulation data which are subject to many sources of error.

### III. PROPOSED NBTI SENSOR DESIGN

#### A. System-Level Overview

A block diagram of the measurement system is displayed in Fig. 2(a). This NBTI sensor is primarily composed of an analog DLL and an on-chip reference clock (CLK) generator. The DLL contains a voltage controlled delay line (VCDL), and the circuitry needed to adjust the control voltage  $(V_{\text{control}})$  which locks the VCDL output into phase with the delayed reference CLK. This control unit includes a phase comparator, a charge pump, and a loop filter. A startup control circuit is also included to prevent false locking and improve lock times by resetting the phase comparator when the DLL is shut down, and pinning  $V_{\text{control}}$  to a bias in the middle of its locking range  $(V_{\text{bias}_{\text{initial}}})$  until the DLL is switched on with the MEASSTRESS signal. The VCDL consists of a chain of delay stages that are placed under NBTI stress, in series with a number of unstressed stages. The latter number can be adjusted with a MUX during calibration to move  $V_{\text{control}}$  into the optimal gain region. Note that an adjustable delay line was also added in the reference CLK path, so this DLL locks the VCDL output directly into phase with the reference input to the



Fig. 3. (a) Stressed stages total delay versus  $V_{\rm const}$  (see Fig. 6). (b) Unstressed stages total delay versus  $V_{\rm control}$  for a varied number of unstressed stages.

phase comparator (i.e., there is not a  $360^{\circ}$  phase difference between the reference and the VCDL output as is the case in other common designs [26], [27]).

During stress periods, the DLL is deactivated while stress conditions are applied to each DUT in the stressed stages. When a measurement is started, stress conditions are removed and the DLL is activated so that  $V_{\text{control}}$  can settle and be recorded off chip. The stressed stages are biased by the constant  $V_{\rm const}$  during measurements, so this portion of the VCDL will slow down after NBTI stressing due to the  $V_{tp}$  degradation, whose impact on delay is directly proportional to that of an increasing  $V_{\rm const}$  bias [see Section III-C, Fig. 3(a)]. Note that  $V_{\rm const}$  was supplied from off-chip, but can also be driven by an unstressed replica DLL [23], or another on-chip bias generator. The unstressed stages are biased by  $V_{\text{control}}$ , which decreases to speed these buffers up [see Fig. 3(b)] during measurements to compensate for the slower stressed stages. This system was designed to achieve a maximum gain of over  $15 \times$  from the increase in  $|V_{tp}|$  to the corresponding decrease in  $V_{\text{control}}$ , which is characterized in a simple calibration step, as covered in Section III-C.



Fig. 4. (a) Unstressed stages are adjustable capacitor-loaded buffers. (b) The phase comparator [26] with added ENABLE signal. (c) The charge pump [26].

DLLs are often preferred over phase locked loops in applications such as frequency synthesis for a variety of reasons, including their unconditional stability when the loop bandwidth  $(\omega_N)$  is held a decade or more below the operating frequency [29]. The DLL used in the proposed NBTI sensor is not employed in a clocking network where a high loop bandwidth might be required, so we designed to stay well below this one decade guideline with a ratio of ~ 0.00127 when the reference CLK frequency was 125 MHz, as seen in Fig. 2(b). Due to the fact that the DLL acts as a single-pole low-pass filter with a cutoff frequency of  $\omega_N$  to shifts in the reference CLK, setting a low value for this parameter also enhances the system's jitter performance [27], while still allowing locking at a rate proportional to  $\omega_{\text{REF}}/\omega_N$  cycles [29]. This results in a ~7 to ~ 20  $\mu$ s locking time in our design.
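As a rough consistency check on the figures quoted above, the following sketch computes the loop bandwidth implied by the stated ratio and an estimated lock time. The proportionality constant `k` between the lock time in cycles and $\omega_{\text{REF}}/\omega_N$ is an assumption introduced here for illustration; only the ratio and the reference frequency come from the text.

```python
# Back-of-the-envelope check of the loop bandwidth ratio and lock time.
# The constant k is an assumption; the ratio and f_ref are taken from the text.

def loop_bandwidth_hz(f_ref_hz, ratio):
    """Loop bandwidth implied by the ratio omega_N / omega_REF."""
    return f_ref_hz * ratio

def lock_time_s(f_ref_hz, ratio, k=1.0):
    """Lock time if locking takes k * (omega_REF / omega_N) reference cycles."""
    cycles = k / ratio
    return cycles / f_ref_hz

f_ref = 125e6     # reference CLK frequency (Hz)
ratio = 0.00127   # omega_N / omega_REF

print(f"loop bandwidth ~ {loop_bandwidth_hz(f_ref, ratio) / 1e3:.0f} kHz")
print(f"lock time (k = 1) ~ {lock_time_s(f_ref, ratio) * 1e6:.1f} us")
```

With `k = 1` the estimate is roughly 6.3 µs, which is the same order of magnitude as the ~7 to ~20 µs range reported for this design; values of `k` between about 1 and 3 would span that range.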

## B. Selected System Components

Selected system components are pictured in Fig. 4. The unstressed stages are capacitor-loaded delay buffers [see Fig. 4(a)]. As  $V_{\text{control}}$  drops, the output load of each stage is lowered due to the smaller  $V_{\rm gs}$  value on the Mncap transistor, thereby decreasing the line delay. The effectiveness of lowering this value as a means of decreasing the delay rapidly decreases as it approaches the threshold voltage of Mncap [see Fig. 3(b)]. Next, the phase comparator [see Fig. 4(b)] asserts equal short duration output pulses for in-phase signals to avoid a dead-band region [26]. An ENABLE signal was added to this design to reset the comparator during stress periods. In order to take advantage of the short-duration pulses created by the phase comparator for in-phase signals when the DLL is active, the charge pump [see Fig. 4(c)] output ("OUT") does not change when both input signals from the phase comparator are asserted for equal periods [26]. Note that in order to avoid large drain-source voltages in the devices adjacent to OUT, which could lead to additional



Fig. 5. (a) Simulated translation of  $\Delta V_{\rm tp}$  to  $\Delta V_{\rm control}$  for equivalent VCDL delay. (b) The  $\Delta V_{\rm tp}$  to  $\Delta V_{\rm control}$  gain plots corresponding to part (a). (c) Simple equation used to calculate the system gain at each point. Note that during measurements when we have selected a particular number of unstressed stages to use, the gain plot of interest will be system gain versus  $V_{\rm control}$  [see Fig. 10(b)].



Fig. 6. (a) Stressed stage buffer design. (b)  $\Delta V_{\rm tp}$  of Mp in the stressed buffers is directly proportional to  $\Delta V_{\rm const}$ . This relationship allows us to calibrate the sensor, as shown in Fig. 10.

unwanted shifts in  $V_{\rm control}$ , and therefore a phase offset in the VCDL output, it is best to operate with this value centered at ~ VCC/2. In the 1.2 V technology used for this implementation, we designed to stay in the 450–850 mV range.

### C. System Gain and Calibration

Our design was tuned to attain maximal gain from  $\Delta V_{\rm tp}$  in the stressed stages to the decrease in  $V_{\rm control}$ , since the latter value will be measured off-chip and translated into the threshold shift. The simulation results in Fig. 5(a) illustrate the translation of  $\Delta V_{\rm tp}$  into  $\Delta V_{\rm control}$  for a varied number of unstressed stages at an equivalent nominal delay point. The corresponding system gain plot in Fig. 5(b) is created with the simple equation in Fig. 5(c). Note that during measurements, the gain plot of interest will be gain versus  $V_{\rm control}$ , when we have selected one particular number of unstressed stages to use [see Fig. 10(b)].

As illustrated in Fig. 6(b),  $\Delta V_{\rm tp}$  (in this case, a shift in the nominal threshold voltage value, VTH0, in the BSIM parameter file) is directly proportional to  $\Delta V_{\rm const}$  in the buffer structure used for the stressed stages [see Fig. 6(a)]. That is,  $V_{\rm tp}$  must be changed by the same amount as  $V_{\rm const}$ , with the other held constant, in order to cause an equivalent stage delay shift. This relationship can be derived from the standard saturation current equation, where we see that the two values of interest have the

same effect on the drain current, and hence, the buffer delay as well since delay is proportional to  $C_{\text{load}} * \text{VCC}/I_D$ , where

$$I_D = \frac{1}{2}\mu_p C_{\rm OX} \left(\frac{W}{L}\right) \left(\left[\text{VCC} - V_{\rm const}\right] - \left|V_{\rm tp}\right|\right)^{\alpha}$$

Therefore, the system gain from  $\Delta V_{\text{tp}}$  to  $\Delta V_{\text{control}}$  can be checked during calibration by sweeping  $V_{\text{const}}$ , while recording the corresponding change in  $V_{\text{control}}$ , as described next. Note that this delay stage could also be used in the Silicon Odometer framework [16] in order to take advantage of that system's precision and digital nature, while isolating NBTI stress in the delay stages' pMOS header devices, rather than simply raising the supply voltage of standard inverters to get a general stress measurement.
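The equivalence between a shift in $|V_{\rm tp}|$ and an equal shift in $V_{\rm const}$ can be checked numerically. The sketch below evaluates the alpha-power-law current for both cases; the bias values, the exponent, and the lumped prefactor `k` (standing in for $\frac{1}{2}\mu_p C_{\rm OX} W/L$) are illustrative assumptions, not values from this work.

```python
import math

# Alpha-power-law drain current of the pMOS header (arbitrary units).
# All numeric values below are illustrative assumptions.

def drain_current(v_const, vtp_abs, vcc=1.2, alpha=1.3, k=1.0):
    """I_D = k * ((VCC - V_const) - |V_tp|)**alpha, valid for positive overdrive."""
    overdrive = (vcc - v_const) - vtp_abs
    if overdrive <= 0:
        raise ValueError("device is not in strong inversion")
    return k * overdrive ** alpha

v_const, vtp_abs, shift = 0.40, 0.35, 0.020  # volts (assumed)

i_aged = drain_current(v_const, vtp_abs + shift)    # |V_tp| raised by NBTI
i_swept = drain_current(v_const + shift, vtp_abs)   # V_const raised in calibration

# Both cases reduce the overdrive by the same amount, so the currents agree
# and the stage delay (proportional to C_load * VCC / I_D) shifts identically.
print(math.isclose(i_aged, i_swept, rel_tol=1e-9))
```

Because only the sum $V_{\rm const} + |V_{\rm tp}|$ enters the expression, the result holds for any exponent $\alpha$, which is why a $V_{\rm const}$ sweep can stand in for threshold degradation during calibration.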

In order to calibrate this sensor, we first estimate the maximum expected  $V_{\rm tp}$  degradation based on published data. In the 130 nm process used for this work, we estimated a maximum degradation of ~20-30 mV with 2.4 V stress [5], [11], [24], and this number was then refined based on our initial measurements. Next, we determine a range of acceptable  $V_{\rm const}$  bias points in the saturation region which will keep the stressed stages' total delay in the indicated portion of the plot in Fig. 3(a) throughout a worst-case degradation. The stressed buffers have a high sensitivity to  $\Delta V_{\rm const}$, and therefore  $\Delta V_{\rm tp}$, in this region. With the initial  $V_{\rm const}$  bias point set, we sweep this parameter from that point using an off-chip source for a varying number of unstressed stages at the desired measurement temperature, and record the change in  $V_{\rm control}$  for each sweep.

As seen in Fig. 5(a), there is a larger range of  $V_{\text{control}}$  biases which will still allow a phase lock with the delayed reference CLK when fewer unstressed stages are used. If an excessive number is selected, the minimum delay (reached roughly when  $V_{\rm control}$  approaches the nMOS threshold voltage) is not sufficiently low to compensate for the maximum projected degradation in the stressed buffers' delay. However, rather than design with a fixed short delay line, we allow the number of unstressed stages to be varied in order to account for any effects that are not captured correctly in simulations. When  $V_{\text{control}}$  moves to lower values to speed up the unstressed portion of the VCDL, it has a small effect on the stage delay [see Fig. 3(b)], so a large bias change is needed to compensate for the degradation in the stressed stages. Based on the calibration results, we select an optimal number of stages and the pMOS header bias for maximum gain across the projected  $V_{tp}$  degradation range. The resultant  $V_{\rm control}$  versus  $V_{\rm const}$  curve defines the translation of the final measured  $V_{\rm control}$  values into the pMOS threshold degradation characteristic that is sought in NBTI measurements. Note that while the calibration for the initial chip involves an exploration of the possible operating space, subsequent test chips should only require one sweep of  $V_{\text{const}}$  to extract the required translation curve for each temperature of interest, barring significant sensor-to-sensor variation. In our experiments involving fifteen operational test chips with two sensor instances each, no adjustments were required after system parameters were selected for the first tested instance. Each DLL tested after this locked correctly across the entire range of  $V_{\text{const}}$ . In addition, note that it is not expected that the DUTs will experience any appreciable



Fig. 7. (a) Failed phase lock due to the first delayed reference CLK pulse at the phase comparator input being excessively late. (b) Failed lock due to high initial value of  $V_{\rm control}$ . (c) Correct phase lock simulation (with a time gap in the plot due to plot file sizes). Lock is achieved within 18  $\mu$ s in this example, which is representative of standard operation.

aging during the short calibration period, as all voltage drops across any pair of terminals are less than $|\text{VCC}|$.

## D. DLL Locking Time and Measurement Delay

The DLL in our application is required to shut down and start up quickly and reliably for each measurement. The startup control circuit pictured in Fig. 2(a) helps to ensure that the DLL will fall into the proper phase lock each time, and improves lock times at a fixed  $\omega_N$. It accomplishes these tasks by resetting the phase comparator when the DLL is shut down, and pinning  $V_{\text{control}}$  to a bias in the middle of its locking range ( $V_{\text{bias}_{\text{initial}}}$ ) until the DLL is switched on. This circuit enables the DLL control unit to begin comparing its two input clock signals only after the MEASSTRESS signal goes high and one full output pulse is detected from the VCDL. Due to this timing feature, and the fact that the reference input to the phase comparator is also delayed by roughly one CLK period, we must simply make sure that the first pulse of this reference input rises before the first VCDL output pulse falls. Fig. 7(a) illustrates the consequence of failing to meet this constraint: the late delayed reference CLK pulse appears to be arriving at the phase comparator earlier than the VCDL output, so  $V_{\text{control}}$  is driven low to compensate. In this example,  $V_{\mathrm{control}}$  is driven all the way to 0 V, and even a harmonic phase lock is not possible. We can prevent this by




Fig. 8. (a) Stress switch capable of driving signals at  $V_{\rm Mp}$  ranging down to -1.2 V. This structure facilitates DC and AC stress conditions. (b) Simulated AC stress waveforms generated from an extracted netlist of this stress switch.  $V_{\rm Mp}$  values do not swing up fully to 1.2 V due to  $V_{\rm minus}$  remaining at -2 V even during this high duty cycle.

selecting a smaller number of reference CLK delay buffers, a larger number of unstressed stages, and/or a higher initial value for  $V_{\text{control}}$ .

Selecting a proper initial value for  $V_{\text{control}}$  [ $V_{\text{bias}_{\text{initial}}}$  in Fig. 2(a)] can prevent harmonic locking, or in the worst case even a complete failure of the DLL to operate as shown in Fig. 7(b). In that simulation, an initial bias of 900 mV causes an excessively long rise time in the unstressed stages in comparison with the total reference CLK period. This effect leads to a shrinking pulse width in the later stages of the unstressed delay chain. If enough unstressed stages are selected, the input CLK pulse may not propagate all the way through the delay chain, and the VCDL output will remain at 0 V, preventing the DLL from operating. In contrast, the simulation results shown in Fig. 7(c) demonstrate the ability of the DLL to quickly lock when a good initial bias, number of unstressed buffers, and reference CLK delay chain length are chosen. Simulations showed that with the application of these proper initial conditions, locking times are less than  $\sim 20 \,\mu s$, which sets the lower limit for our measurement time.

The main issues that may prevent DLL locking are concisely summarized here: 1) the total VCDL delay is too short with respect to the reference CLK or 2) a high value of  $V_{\rm bias_{initial}}$  in combination with too many unstressed stages prevents a full swing at the VCDL output. The steps taken to prevent this during system calibration are as follows:

- 1) Start with the minimum available number of unstressed stages.
- 2) Choose an initial number of reference CLK delay stages and sweep through  $V_{\text{const}}$  values to check for DLL locking.
- 3) If the DLL fails to lock across the desired operating range, reduce the number of reference CLK stages and repeat the previous step.
- 4) If the DLL fails to lock with the minimum number of reference CLK stages, begin increasing the number of unstressed VCDL stages until a lock is achieved in the  $V_{\text{const}}$  range of interest. As this number is increased, the DLL may fail to lock at high values of  $V_{\text{bias}_{\text{initial}}}$ , but this will be observed during the calibration sweeps and that range can be easily avoided during measurements.
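The four calibration steps above can be sketched as a simple search. The `locks` callback is a hypothetical hook that reports whether the DLL achieved phase lock for one configuration (on hardware it would wrap a bench measurement), and `toy_locks` is a made-up stand-in that only mimics the first failure mode (VCDL delay too short relative to the reference path); neither comes from the original work.

```python
def find_locking_config(locks, ref_options, unstressed_options, v_const_sweep):
    """Return (n_ref, n_unstressed) for the first setting that locks over the sweep.

    locks(n_ref, n_unstressed, v_const) -> bool is a hypothetical measurement hook.
    Returns None if no explored setting locks across the whole sweep.
    """
    v_points = list(v_const_sweep)
    ref_desc = sorted(ref_options, reverse=True)
    unstressed_asc = sorted(unstressed_options)

    def locks_everywhere(n_ref, n_un):
        return all(locks(n_ref, n_un, v) for v in v_points)

    # Steps 1-3: minimum unstressed stages, progressively fewer reference stages.
    for n_ref in ref_desc:
        if locks_everywhere(n_ref, unstressed_asc[0]):
            return n_ref, unstressed_asc[0]

    # Step 4: minimum reference stages, progressively more unstressed stages.
    for n_un in unstressed_asc[1:]:
        if locks_everywhere(ref_desc[-1], n_un):
            return ref_desc[-1], n_un
    return None


def toy_locks(n_ref, n_un, v_const):
    """Made-up model: lock requires VCDL delay >= reference path delay."""
    vcdl_delay = 4.0 + 2.0 * v_const + 0.5 * n_un   # arbitrary units
    return vcdl_delay >= 1.0 * n_ref


print(find_locking_config(toy_locks, [6, 7, 8], [2, 4, 6], [0.3, 0.4, 0.5]))
```

In this toy case no reference-stage count locks with two unstressed stages, so the search falls through to step 4 and settles on six reference stages with four unstressed stages. The second failure mode (a high $V_{\text{bias}_{\text{initial}}}$ with too many unstressed stages) is not modeled here.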

## E. Stress Switch Design for AC Stress Measurements

The circuitry illustrated in Fig. 8(a) can be used in conjunction with the stressed VCDL stages in order to facilitate AC



Fig. 9. (a) Chip microphotograph, (b) measurement lab setup including the LabVIEW software interface, and (c) summary of the test chip characteristics.



Fig. 10. (a) Measured calibration curves. (b) A polynomial is fit to the corresponding gain plots [derived from (1) in this figure], and subsequently used to translate  $\Delta V_{\rm control}$  readings during stress experiments into  $\Delta V_{\rm tp}$  for each sensor.

stress measurements. The Stress\_Clk signal shown in that figure can be held constant at either zero or VCC (1.2 V). The former is applied during the measurement period to bias  $V_X$  at 1.2 V, and the latter is used during DC stress measurements. Alternatively, Stress\_Clk can swing between these values at frequencies up to 50 MHz during a stress period to test for the frequency dependency of NBTI degradation [see Fig. 8(b)]. This frequency limit is imposed by a degraded falling transition time in relation to the total Stress\_Clk period in this 130 nm technology.

$V_{\text{minus}}$ is set at -2 V in order to pass  $V_{\text{stress}}$ signals ranging down to -1.2 V, because pMOS transistors pass low voltages weakly. The use of pMOS devices was necessary with negative voltages on the source and drain terminals so as not to forward bias PN junctions between those diffusion areas



Fig. 11. (a) Measurements taken during  $V_{\rm gs}=-1.0$  V stress. A microsecond-order NBTI recovery is apparent as  $V_{\rm control}$  rises quickly to slow down the unstressed stages while the threshold voltage of the stressed stages recovers. (b) Fresh DLL readings (with  $V_{\rm gs}=0$  V between measurements) remain relatively flat.

and the substrate. This pMOS-based setup creates a stress condition ranging down to  $V_{\rm gs} = -2 \times \text{VCC}$, and therefore would not negatively impact our results since creating this stress bias is our goal. The pMOS biased at  $V_{\rm minus}$  is always in strong inversion, so the width of the device stacked above it is skewed up (10  $\mu$m compared to 0.5  $\mu$m) in order to drive the internal node voltage ( $V_X$ ) back up to ~ 1.2 V. The MEASSTRESS signal switches between -2 V during stress periods and 1.2 V when the DLL is activated for measurements. Note that  $V_{\rm minus}$,  $V_{\rm stress}$, and MEASSTRESS are driven from off-chip, while Stress\_Clk is controlled from off-chip, but the alternating signal for AC stress measurements is created on-chip with a voltage controlled oscillator.

## IV. TEST CHIP MEASUREMENT RESULTS

The proposed NBTI sensor was fabricated in a 1.2 V, 130 nm CMOS process. The dimensions of the pMOS DUTs are  $W/L = 6 \ \mu m/260$  nm. Automated MEASSTRESS signal pulses and other control signals are generated with LabVIEW software, which allows us to take fast measurements of  $V_{\rm control}$ . A chip microphotograph (a), a picture of the measurement setup (b), and the test chip characteristics (c) are shown in Fig. 9.



Fig. 12. (a) Constant stress experiment results. (b) The threshold degradation after 1850 s of stress plotted versus the stress voltage. NBTI degradation is exponentially dependent on this value.

As covered in Section III, our sensor is calibrated using a set of variables prior to applying stress, including the adjustable number of stages in the reference CLK delay structure and in the unstressed stages of the VCDL. After finding the optimal point for maximum gain within our preferred operating space, we extract the  $V_{\text{control}}$  vs.  $V_{\text{const}}$  (and therefore  $V_{\text{tp}}$ ) curve as illustrated in Fig. 10(a). Next, we calculate the gain in each increment and create a gain versus  $V_{\text{control}}$  plot [points in Fig. 10(b)]. Finally, we fit a third-order polynomial to this gain plot (solid lines in Fig. 10(b)), which will later be used to translate the measured  $\Delta V_{\text{control}}$  during stress experiments to a  $\Delta V_{\text{tp}}$  characteristic. This is accomplished by plugging each sequentially measured pair of  $V_{\text{control}}$  values into (1) and (2) in Fig. 10.
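The calibration flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the script used for the measurements: the calibration sweep is synthetic, the function names `fit_gain_polynomial` and `delta_vtp` are hypothetical, and since (1) and (2) of Fig. 10 are not reproduced in the text, the conversion simply assumes that each step in $V_{\text{control}}$ divided by the fitted gain at the midpoint of that step gives the corresponding step in $V_{\text{tp}}$. It also assumes that a shift in $V_{\text{const}}$ stands in one-to-one for a shift in $V_{\text{tp}}$, and it works with magnitudes only.

```python
import numpy as np

def fit_gain_polynomial(v_const, v_control, order=3):
    """Fit gain = |dV_control / dV_const| as a polynomial in V_control."""
    v_const = np.asarray(v_const, dtype=float)
    v_control = np.asarray(v_control, dtype=float)
    # Gain in each calibration increment (finite difference).
    gain = np.abs(np.diff(v_control) / np.diff(v_const))
    # Assign each gain value to the midpoint V_control of its increment.
    v_mid = 0.5 * (v_control[:-1] + v_control[1:])
    return np.polyfit(v_mid, gain, order)

def delta_vtp(v_control_readings, gain_coeffs):
    """Convert sequential V_control readings to cumulative |delta V_tp|."""
    v = np.asarray(v_control_readings, dtype=float)
    v_mid = 0.5 * (v[:-1] + v[1:])
    gain = np.polyval(gain_coeffs, v_mid)
    # Each step in V_control is divided by the local gain, then summed.
    steps = np.abs(np.diff(v)) / gain
    return np.concatenate(([0.0], np.cumsum(steps)))

# Synthetic calibration sweep (made-up numbers, for illustration only).
v_const_cal = np.linspace(0.0, 0.05, 26)
v_control_cal = 0.9 - 10.0 * v_const_cal - 40.0 * v_const_cal**2
coeffs = fit_gain_polynomial(v_const_cal, v_control_cal)

# Hypothetical V_control readings taken at successive measurement pulses.
readings = [0.90, 0.85, 0.80, 0.76]
shift = delta_vtp(readings, coeffs)
```

Evaluating the gain at the midpoint of each pair of readings keeps the conversion accurate when the gain varies across the operating range, which is the reason for fitting a gain curve rather than using a single average value.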

During constant stress experiments, three measurements were made per decade of time (on a seconds scale) since NBTI is known to follow a power law behavior. Even during the fast 1 ms measurement pulses with a sampling rate of 35 kHz, which is sufficient to track changes in tens of microseconds, recovery can be clearly observed as $V_{\text{control}}$ quickly rises from its initial settling value after the DLL is activated [see Fig. 11(a)]. In order to confirm that this rising value is due to NBTI recovery rather than a long control voltage settling time, $V_{\text{control}}$ was also measured on a fresh sensor that had not undergone any stress. As illustrated in Fig. 11(b), the control voltage settles almost immediately in that case, even when $V_{\text{const}}$ is adjusted such that $V_{\text{control}}$ is near the lower end of its operating range. Therefore, we infer that the fast rise in $V_{\text{control}}$ seen throughout the 1 ms measurement periods is due to $V_{\text{tp}}$ recovery in the stressed DUTs. That measurement time was chosen based on our experience with the



Fig. 13. (a) Comparison of NBTI stress measurements at 25  $^{\circ}$ C and 100 $^{\circ}$ C. (b) Comparison of 1 ms and 2 s measurement pulse results. A larger power law exponent is observed with longer measurement times, as found in [25].



Fig. 14. (a) Stress/Recovery curves demonstrate fast recovery when  $V_{\rm gs} = 0$  V. Note that  $\Delta V_{\rm tp}$  does not fully recover in ~1000 s at 25 °C. This behavior was also found by (b) Kim [16] and (c) Shen [17] with high-speed measurement techniques, as well as Varghese [21] with on-the-fly measurements.

test equipment, and maintained for consistency, although the results in Fig. 11 show that our design is capable of providing readings in tens of microseconds. In these particular results, we see that the maximum required measurement time is roughly 57  $\mu$ s or less, since a steady  $V_{\text{control}}$  value is available by the second measured point with a 35 kHz sampling rate. Note that several papers published after this design was fabricated have reported faster measurement times [14], [16], [17], showing that any future implementation of the DLL system presented here should have faster locking times (< 1  $\mu$ s). This is possible with any number of improvements found in DLL design literature.
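As a quick check of the measurement time quoted above, the sample spacing at a 35 kHz sampling rate and the time to the second sample work out as follows. The symbols $f_s$, $T_s$, and $t_{\text{meas}}$ are introduced here only for illustration, and the first sample is assumed to fall one sampling period after the DLL is activated.

$$
T_s = \frac{1}{f_s} = \frac{1}{35\ \text{kHz}} \approx 28.6\ \mu\text{s}, \qquad t_{\text{meas}} \le 2\,T_s \approx 57\ \mu\text{s}
$$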

Constant stress experiment results are displayed in Fig. 12. Stress voltages ranging from -1.6 V to -2.4 V were each applied for a period of 1850 s. A power law exponent ranging from 0.107 to 0.121 is observed in Fig. 12(a), matching well with recently published findings [16], [17]. These exponents are smaller than the 0.25 value attributed to an H diffusion limited process, or 0.16 when H<sub>2</sub> is the diffusing reactant, and have been attributed to a faster charge trapping/detrapping process that cannot be correctly observed with slower measurement methods [17]. Fig. 12(b) shows the exponential dependency of the threshold degradation on the stress voltage. Measurements were also obtained at high temperatures, showing accelerated degradation due to this thermally activated process [see Fig. 13(a)]. Fig. 13(b) illustrates the effects of taking measurements at the end of a longer 2 s measurement pulse, and hence, after excessive unwanted threshold voltage recovery. As observed in past publications [25], the power law exponents of results from extended measurement interruption periods are markedly higher.
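A power law exponent of the kind reported above can be extracted with a straight-line fit in log-log space. The following is a minimal sketch under stated assumptions rather than the procedure used for Fig. 12: it assumes the form $\Delta V_{\text{tp}} = A\,t^{n}$, the helper name `fit_power_law` is hypothetical, the 1-2-5 sampling pattern is just one way to obtain roughly three points per decade, and the data are synthetic values generated with $n = 0.11$ rather than measured results.

```python
import numpy as np

def fit_power_law(t, dvtp):
    """Least-squares fit of dvtp = A * t**n in log-log space; returns (A, n)."""
    log_t = np.log10(np.asarray(t, dtype=float))
    log_v = np.log10(np.asarray(dvtp, dtype=float))
    # Slope of the log-log line is the exponent n; intercept is log10(A).
    n, log_a = np.polyfit(log_t, log_v, 1)
    return 10.0**log_a, n

# Roughly three sample times per decade (assumed 1-2-5 pattern), in seconds.
t_s = np.array([1, 2, 5, 10, 20, 50, 100, 200, 500, 1000], dtype=float)

# Synthetic degradation data in volts, generated with n = 0.11 (not measured).
dvtp_v = 0.010 * t_s**0.11

a_fit, n_fit = fit_power_law(t_s, dvtp_v)
```

Sampling at logarithmically spaced times gives each decade similar weight in the fit, which suits a quantity that follows a power law in time.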

Successive stress and recovery measurements are presented in Fig. 14(a). A fast and significant recovery is observed, matching the characteristics seen in Fig. 14(b) and (c) [16], [17]. This behavior has been attributed to the charge trapping/detrapping process, and was shown to be incompatible with a slower diffusion limited process [17], in which the recovery process would be slower and stress time dependent. However, it should be noted that this assertion stands in contrast to other recent work, where the fast recovery is said to be compatible with the R-D model in which interface trap creation and passivation is the underlying cause for NBTI transient effects [21].

## V. CONCLUSION

We have described a fast and efficient on-chip NBTI degradation sensor using a DLL. The shift in pMOS threshold voltage due to stress is amplified and directly translated into the control voltage of that DLL, which facilitates an easy characterization of NBTI in the DUTs. The proposed measurement system is capable of measuring the effects of accelerated DC and AC stress by simply monitoring that control voltage with standard lab equipment. No expensive probe stations are required, and if designed with an emphasis on minimizing area, this system could be used as an on-chip real-time aging monitor to control an aging compensation mechanism such as clock frequency adjustments. A test chip was fabricated in a 1.2 V, 130 nm CMOS process for concept verification. The hardware implementation demonstrated a maximum system gain from the pMOS threshold degradation to a DLL control voltage drop of up to  $16 \times$  in the operating range of interest, with an average gain of  $\sim 10 \times$ . Simulations as well as measurements show that this system is capable of taking threshold voltage readings in tens of microseconds in order to avoid unwanted recovery. NBTI degradation measurements were presented for a range of stress voltages, temperatures, and measurement times, demonstrating the flexibility of our design.

## ACKNOWLEDGMENT

The authors would like to thank United Microelectronics Corporation (UMC) for the foundry design kit and chip fabrication.

## REFERENCES

- [1] V. Huard and M. Denais, "Hole trapping effect on methodology for DC and AC negative bias temperature instability measurements in PMOS transistors," in *Proc. IEEE Int. Reliab. Physics Symp.*, Apr. 2004, pp. 40–45.
- [2] M. Denais, V. Huard, C. Parthasarathy, G. Ribes, F. Perrier, D. Roy, and A. Bravaix, "New perspectives on NBTI in advanced technologies: Modelling & characterization," in *Proc. IEEE Eur. Solid-State Device Res. Conf.*, Sep. 2005, pp. 399–402.
- [3] R. Vattikonda, W. Wang, and Y. Cao, "Modeling and minimization of PMOS NBTI effect for robust nanometer design," in *Proc. IEEE Des. Autom. Conf.*, Jul. 2006, pp. 1047–1052.
- [4] M. Ershov, R. Lindley, S. Saxena, A. Shibkov, S. Minehane, J. Babcock, S. Winters, H. Karbasi, T. Yamashita, P. Clifton, and M. Redford, "Transient effects and characterization methodology of negative bias temperature instability in PMOS transistors," in *Proc. IEEE Int. Reliab. Phys. Symp.*, Apr. 2003, pp. 606–607.
- [5] G. Chen, M. Li, C. Ang, J. Zheng, and D. Kwong, "Dynamic NBTI of PMOS transistors and its impact on device lifetime," *IEEE Electron Device Lett.*, vol. 23, no. 12, pp. 734–736, Dec. 2002.
- [6] B. Paul, K. Kang, H. Kufluoglu, M. Alam, and K. Roy, "Impact of NBTI on the temporal performance degradation of digital circuits," *IEEE Electron Device Lett.*, vol. 26, no. 8, pp. 560–562, Aug. 2005.
- [7] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, "An analytical model for negative bias temperature instability," in *Proc. IEEE Int. Conf. Comput.-Aided Des.*, Nov. 2006, pp. 205–210.
- [8] M. Denais, C. Parthasarathy, G. Ribes, Y. Rey-Tauriac, N. Revil, A. Bravaix, V. Huard, and F. Perrier, "On-the-fly characterization of NBTI in ultra-thin gate oxide PMOSFET's," in *Proc. IEEE Int. Electron Devices Meet.*, Dec. 2004, pp. 109–112.
- [9] M. Denais, A. Bravaix, V. Huard, C. Parthasarathy, C. Guerin, G. Ribes, F. Perrier, M. Mairy, and D. Roy, "Paradigm shift for NBTI characterization in ultra-scaled CMOS technologies," in *Proc. IEEE Int. Reliab. Phys. Symp.*, Mar. 2006, pp. 735–736.
- [10] S. Aota, S. Fujii, Z. Jin, Y. Ito, K. Utsumi, E. Morifuji, S. Yamada, F. Matsuoka, and T. Noguchi, "A new method for precise evaluation of dynamic recovery of negative bias temperature instability," in *Proc. IEEE Int. Conf. Microelectron. Test Structures*, Apr. 2005, pp. 197–199.
- [11] R. Fernández, B. Kaczer, A. Nackaerts, S. Demuynck, R. Rodriguez, M. Nafria, and G. Groeseneken, "AC NBTI studied in the 1 Hz – 2 GHz range on dedicated on-chip circuits," in *Proc. IEEE Int. Electron Devices Meet.*, Dec. 2006, pp. 337–340.
- [12] T. Grasser, W. Gös, V. Sverdlov, and B. Kaczer, "The Universality of NBTI Relaxation and its implications for modeling and characterization," in *Proc. IEEE Int. Reliab. Phys. Symp.*, Apr. 2007, pp. 268–280.
- [13] S. Kumar, C. Kim, and S. Sapatnekar, "Impact of NBTI on SRAM read stability and design for reliability," in *IEEE Int. Symp. Quality Electron. Des.*, Mar. 2006, pp. 210–218.
- [14] C. Schlunder, W. Heinrigs, W. Gustin, and H. Reisinger, "On the impact of the NBTI recovery phenomenon on lifetime prediction of modern p-MOSFETs," in *Proc. IEEE Int. Integr. Reliab. Workshop*, Oct. 2006, pp. 1–4.

- [15] S. Zafar, "Statistical mechanics based model for negative bias temperature instability," *J. Appl. Phys.*, vol. 97, no. 10, pp. 1–9, 2005.
- [16] T. Kim, R. Persaud, and C. H. Kim, "Silicon Odometer: An on-chip reliability monitor for measuring frequency degradation of digital circuits," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 874–880, Apr. 2008.
- [17] C. Shen, M. Li, C. Foo, T. Yang, D. Huang, A. Yap, G. Samudra, and Y. Yeo, "Characterization and physical origin of fast Vth transient in NBTI of pMOSFETs with SiON dielectrics," in *Proc. IEEE Electron Devices Meet.*, Dec. 2006, pp. 1–4.
- [18] Y. Mitani, M. Nagamine, H. Satake, and A. Toriumi, "NBTI mechanism in ultra-thin gate dielectric – nitrogen originated mechanism in SiON," in *Proc. IEEE Electron Devices Meet.*, Dec. 2002, pp. 509–512.
- [19] M. B. Ketchen, M. Bhushan, and R. Bolam, "Ring oscillator based test structure for NBTI analysis," in *Proc. IEEE Int. Conf. Microelectron. Test Structures*, Mar. 2007, pp. 42–47.
- [20] W. Wang, S. Yang, S. Bhardwaj, R. Vattikonda, S. Vrudhula, F. Liu, and Y. Cao, "The impact of NBTI on the performance of combinational and sequential circuits," in *Proc. IEEE Des. Autom. Conf.*, Jun. 2007, pp. 364–369.
- [21] D. Varghese, G. Gupta, L. M. Lakkimsetti, D. Saha, K. Ahmed, F. Nouri, and S. Mahapatra, "Physical mechanism and gate insulator material dependence of generation and recovery of negative-bias temperature instability in p-MOSFETs," *IEEE Trans. Electron Devices*, vol. 54, no. 7, pp. 1672–1680, Jul. 2007.
- [22] E. Karl, P. Singh, D. Blaauw, and D. Sylvester, "Compact in-situ sensors for monitoring negative-bias-temperature-instability effect and oxide degradation," in *Proc. IEEE Int. Solid State Circuits Conf.*, Feb. 2008, pp. 410–411.
- [23] J. Keane, T.-H. Kim, and C. H. Kim, "An on-chip NBTI sensor for measuring PMOS threshold voltage degradation," in *Proc. Int. Symp. Low Power Electron. Des.*, Aug. 2007, pp. 189–194.
- [24] T. Yang, M. F. Li, C. Shen, C. Ang, C. Zhu, Y. Yeo, G. Samudra, S. Rustagi, M. Yu, and D. Kwong, "Fast and slow dynamic NBTI components in p-MOSFET with SiON dielectric and their impact on device lifetime and circuit application," in *Proc. IEEE Symp. VLSI Technol.*, Jun. 2005, pp. 92–93.
- [25] J. Li, M. Chen, P. Juan, and K. Su, "Effects of delay time and AC factors on negative bias temperature instability of PMOSFETs," in *Proc. IEEE Int. Integr. Reliab. Workshop*, Oct. 2006, pp. 16–19.
- [26] J. Maneatis, "Low-jitter process-independent DLL and PLL based on self-biased techniques," *IEEE J. Solid-State Circuits*, vol. 31, no. 11, pp. 1723–1732, Nov. 1996.
- [27] H. Chang, J. Lin, C. Yang, and S. Liu, "A wide-range delay-locked loop with fixed latency of one clock cycle," *IEEE J. Solid-State Circuits*, vol. 37, no. 8, pp. 1021–1027, Aug. 2002.
- [28] T. Matsumoto, "High-resolution on-chip propagation delay detector for measuring within-chip variation," in *Proc. IEEE Int. Conf. Integr. Circuit Des. Technol.*, May 2005, pp. 217–220.
- [29] A. Chandrakasan, W. J. Bowhill, and F. Fox, *Design of High-Performance Microprocessor Circuits*. New York: IEEE Press, 2001, pp. 235–260.



**John Keane** (S'08) received the B.S. degree (*summa cum laude*) in computer engineering from the University of Notre Dame, Notre Dame, IN, in 2003, and the M.S. degree in electrical engineering from the University of Minnesota, Twin Cities, in 2005, where he is currently pursuing the Ph.D. degree under the guidance of Prof. C. H. Kim.

He has completed internships with Seagate Technology, Unisys, IBM, Rochester, MN, and the IBM Research Lab, Austin, TX. His research involves developing methods to monitor aging and variation mechanisms in advanced CMOS technologies, as well as low power design issues. He has coauthored over 15 conference and journal papers.

Mr. Keane was a recipient of the 2009 DAC/ISSCC Student Design Contest and the University of Minnesota Graduate School Fellowship for the 2003–2005 academic years, as well as IBM Ph.D. Fellowship Awards for 2008–2010.



**Tae-Hyoung Kim** (S'06) received the B.S. and M.S. degrees in electrical engineering from Korea University, Seoul, Korea, in 1999 and 2001, respectively. He is currently pursuing the Ph.D. degree from the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis.

In 2001, he joined the Device Solution Network Division, Samsung Electronics, Yong-in, Korea. From 2001 to 2005, he performed research on the design of high-speed SRAM memories. In summer 2007 and 2008, he was with IBM T. J. Watson Research Center, Yorktown Heights, NY, where he worked on NBTI/PBTI-induced frequency degradation measurement circuit and impact of aging on SRAM mismatch. His research interests include low power and high performance VLSI circuit design in nanoscale technologies.

Mr. Kim was a recipient of a 2008 AMD/CICC Student Scholarship Award, a 2008 Departmental Research Fellowship from University of Minnesota, a 2008 DAC/ISSCC Student Design Contest Award, a 2008 Samsung Humantec Thesis Award (Bronze Prize), a 2005 ETRI Journal Paper of the Year Award, a 2001 Samsung Humantec Thesis Award (Honor Prize), and a 1999 Samsung Humantec Thesis Award (Silver Prize).



**Chris H. Kim** (M'04) received the B.S. degree in electrical engineering and the M.S. degree in biomedical engineering from Seoul National University, Seoul, Korea, and the Ph.D. degree in electrical and computer engineering from Purdue University, West Lafayette, IN.

He spent a year at Intel Corporation where he performed research on variation-tolerant circuits, on-die leakage sensor design and crosstalk noise analysis. He joined the electrical and computer engineering faculty at University of Minnesota, Minneapolis, MN, in 2004. He is an author/coauthor of over 60 journal and conference papers and has served as a technical program committee member for numerous circuit design conferences. His current research interests include digital, mixed-signal, and memory circuit design for silicon and non-silicon technologies.

Prof. Kim was a recipient of the 2009 NSF CAREER Award, a 2008 McKnight Land-Grant Professorship, a 2008 3M Non-Tenured Faculty Award, 2008 and 2009 DAC/ISSCC Student Design Contest Awards, 2006 and 2007 IBM Faculty Partnership Awards, a 2005 IEEE Circuits and Systems Society Outstanding Young Author Award, a 2005 ISLPED Low Power Design Contest Award, a 2003 Intel Ph.D. Fellowship, and a 2001 Magoon's Award for excellence in teaching.