## A 65nm and 130nm CMOS programmable analog standard cell library for scalable system synthesis

Pranav Mathews, Praveen Ayyappan, Afolabi Ige, Swagat Bhattacharyya, Linhao Yang, and Jennifer Hasler

Georgia Institute of Technology, Atlanta, GA, USA

Analog IC design for analog computing requires a toolflow and synthesis capability similar to those of large-scale digital design. Analog synthesis to custom IC design requires an analog standard cell library that builds upon the experience and synthesis of large-scale Field Programmable Analog Arrays (FPAA). This effort presents the first programmable analog standard cell library, where Floating-Gate (FG) devices in standard CMOS provide high-precision (e.g., 14-bit) [1] programmability across multiple process nodes. A single library, developed in both 65nm CMOS and Skywater (open-source) 130nm CMOS, facilitates high-level synthesis [2], enabling the creation of versatile analog computing applications that are adaptable across process nodes. Programmability is essential for a moderate number of cells to cover a wide space of analog computing and system design.

This effort describes the design and characterization of a reprogrammable analog standard cell library in 130nm and 65nm CMOS processes, an approach mirroring digital design (Fig. 1). Multiple abstraction layers streamline analog system design in the same way as digital design, enabling integration with high-level synthesis tools [2]. The 4x2 array FG FET cell (direct and indirect programming) sets both the architecture (Fig. 1) and the system pitch (6.5µm for both 130nm and 65nm) for all standard cells. These cells are packed into islands of FG cells, non-FG cells, and supporting circuitry for system place and route. In the island architecture, FG cells abut one another to share infrastructure and achieve dense cell placement.

The 130nm and 65nm cells show qualitatively the same measured behavior with different parameters. For example, FG programming shifts the FG pFET gate sweeps (Fig. 1) in both nodes: electron tunneling decreases the current and hot-electron injection increases it. Many of the parameters are adjusted through FG programming. This effort focuses primarily on the 65nm measurements (die photo). Each library cell can be finely programmed to meet a broad spectrum of analog metrics, simplifying analog metric tradeoffs to power-delay products and noise (all proportional to capacitance). We characterize the blocks within this framework. The test structure contains all the cells in the library: FG array cells, bias currents, multiple nFET and pFET sizes, multiple Transconductance Amplifiers (TA) that can have FG biases and inputs, a Winner-Take-All (WTA) cell, a 6-bit voltage DAC and current DAC, and multiple instrumentation cells. This universal analog library enables numerous analog computing blocks, such as Vector-Matrix Multiplication (VMM) and classifiers, as well as applications such as audio and image processing.

TAs and circuits composed thereof (Fig. 2) are crucial for analog signal processing. Our TAs use a 9-transistor topology with a programmable FG pFET bias and a linear differential input range of ±85mV. FG TAs additionally use FG FETs as the input differential pair, with capacitive attenuation into the gates, thereby allowing compensation of threshold voltage mismatch and extending the TA input linear range up to ±850mV. Both TAs compose the C<sup>4</sup> active bandpass filter, which uses a feedforward TA and a feedback FG TA to set the upper and lower corners, respectively. The maximum possible C<sup>4</sup> filter gain is set by the feedforward capacitance ratio ($C_1/C_2 = 9$ in this work). The flexibility of the C<sup>4</sup> filter, which enables tuning of the corners and Q-factor, is crucial for spectral decomposition in scalable acoustic front ends.
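The numbers quoted above can be related with a small sketch. It assumes a generic tanh-shaped TA transfer curve whose scale voltage is set equal to the quoted linear range, and it assumes the extension from ±85mV to ±850mV comes entirely from capacitive attenuation. Both are illustrative assumptions rather than the measured circuit behavior. The function names and the 1 nA bias current are hypothetical.

```python
import math

def ta_output(dv, i_bias, v_lin, attenuation=1.0):
    """Toy TA model: output current (A) for a differential input dv (V).

    `attenuation` models a capacitive divider in front of the input pair
    (1.0 means no divider). The tanh shape is an assumption.
    """
    return i_bias * math.tanh(attenuation * dv / v_lin)

def c4_max_gain_db(c1_over_c2):
    """Maximum C4 gain magnitude in dB from the capacitance ratio C1/C2."""
    return 20.0 * math.log10(c1_over_c2)

V_LIN_TA = 0.085      # +/-85 mV linear range quoted for the TA
V_LIN_FG_TA = 0.850   # +/-850 mV linear range quoted for the FG TA

# Attenuation implied if the divider alone accounts for the wider range.
attenuation = V_LIN_TA / V_LIN_FG_TA

i_bias = 1e-9  # hypothetical 1 nA bias current, for illustration only

# With the divider, an 850 mV input drives the pair like an 85 mV input.
i_fg = ta_output(0.850, i_bias, V_LIN_TA, attenuation)
i_ta = ta_output(0.085, i_bias, V_LIN_TA)

gain_db = c4_max_gain_db(9.0)  # C1/C2 = 9 in this work
```

Under these assumptions the implied attenuation is a factor of ten, and the capacitance ratio of 9 corresponds to a maximum gain of roughly 19 dB.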

Machine learning classification encompasses an important range of problems that analog computing is well suited to handle. Fig. 3 demonstrates the capabilities of a single-layer universal classifier composed of a VMM and a WTA circuit to solve the non-linear XOR problem. Classification is possible due to the competitive dynamics of the WTA. The VMM is a 3x3 crossbar of FG FETs, where the first column acts as a constant reference and the last two columns act as logic function inputs. The FG weights on the VMM are programmed such that when the inputs match, the XOR row produces less current than its counterparts. The row currents are fed into the WTA, which is programmed to ensure only a single winner is possible. This lateral inhibition, where a winner sinks all the current by starving its competitors, is what allows the XOR node to draw two decision boundaries where a perceptron could only draw one boundary.
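A minimal behavioral sketch of this VMM-plus-WTA classifier follows. The weights are illustrative non-negative values chosen by hand, not the values programmed on the chip, and the WTA is modeled as an ideal argmax rather than the actual circuit dynamics. All names (`WEIGHTS`, `vmm`, `wta`, `classify_xor`) are hypothetical.

```python
# Columns: [constant reference, input x1, input x2].
# Illustrative weights only; all are non-negative, as currents would be.
WEIGHTS = [
    [1.0, 1.0, 1.0],    # XOR row
    [1.5, 0.0, 0.0],    # counterpart that wins when both inputs are 0
    [0.0, 1.75, 1.75],  # counterpart that wins when both inputs are 1
]
XOR_ROW = 0

def vmm(weights, inputs):
    """Row currents of the crossbar: one weighted sum per row."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def wta(currents):
    """Ideal winner-take-all: index of the row with the largest current."""
    return max(range(len(currents)), key=lambda i: currents[i])

def classify_xor(x1, x2):
    """Return 1 if the XOR row wins the competition, else 0."""
    currents = vmm(WEIGHTS, [1.0, x1, x2])
    return int(wta(currents) == XOR_ROW)

truth_table = {(a, b): classify_xor(a, b) for a in (0, 1) for b in (0, 1)}
```

With these weights the XOR row loses to one counterpart when both inputs are 0 and to the other when both are 1, which is how a single competitive layer obtains two decision boundaries.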

An Arbitrary Waveform Generator (AWG) can be used to output a programmable vector for matrix-matrix multiplications or general analog waveform production. In Fig. 4, the AWG is programmed to output sine and sawtooth waveforms, using an FG FET VMM to store coefficients and an analog scanner to choose which columns to select. When a column is selected, the programmed currents on active input rows are summed and converted into an output voltage once per clock cycle. The frequency of the output waveform depends on the clock signal fed into the scanner, and the resolution depends on the FG programming accuracy and the number of VMM columns. Since FG programming accuracy is high, our approach allows users to specify a precise analog waveform at some desired frequency.
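The scanning behavior described above can be sketched behaviorally as follows. This is a simplified model under stated assumptions: currents are normalized and unitless, the current-to-voltage conversion is a single hypothetical `gain` factor, the scanner visits every column once per waveform period, and the column count of 16 is arbitrary.

```python
import math

def program_coefficients(n_cols):
    """Return [sine_row, sawtooth_row] of normalized, non-negative currents."""
    sine = [1.0 + math.sin(2.0 * math.pi * k / n_cols) for k in range(n_cols)]
    saw = [k / n_cols for k in range(n_cols)]
    return [sine, saw]

def awg_output(coeffs, active_rows, n_clocks, gain=1.0):
    """One output sample per clock: sum the active rows of the selected column."""
    n_cols = len(coeffs[0])
    samples = []
    for t in range(n_clocks):
        col = t % n_cols  # the scanner advances one column per clock cycle
        i_sum = sum(row[col] for row, on in zip(coeffs, active_rows) if on)
        samples.append(gain * i_sum)
    return samples

def output_frequency(f_clk, n_cols):
    """Waveform frequency when each period spans all n_cols columns."""
    return f_clk / n_cols

N_COLS = 16  # arbitrary column count for illustration
coeffs = program_coefficients(N_COLS)
sine_out = awg_output(coeffs, [True, False], 2 * N_COLS)
saw_out = awg_output(coeffs, [False, True], 2 * N_COLS)
```

In this model, changing the clock rate rescales the output frequency without touching the stored coefficients, while adding columns gives finer time resolution per period.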

FG FET hot-electron injection rates were characterized using constant drain pulsing (S-curve), Autozeroing FG Amplifier (AFGA)-based programming, and an AFGA-based adaptation circuit (Fig. 5). Both the drain-pulsing and AFGA approaches involve isolating a single FG pFET from a VMM. In the drain-pulsing method, the FG FET is initially tunneled to increase the FG charge, and the drain is then pulsed between $V_{inj}$ (injection voltage) and GND using a constant pulse width of 50µs. Each pulse creates a high drain-to-channel electric field, enabling electrons to gain enough energy to overcome the Si-SiO<sub>2</sub> barrier and inject onto the FG. Each pulsing sequence therefore adds negative charge to the FG node, increasing the channel current of the FG pFET. Through this experiment we obtain the S-curves (injection rate) for different gate voltages with respect to the channel current and pulse number. Injection also happens in the AFGA structure, which is a bandpass FG amplifier that adaptively sets a DC operating point by balancing the FG charge through simultaneous tunneling and injection. Tunneling the FG pFET shifts $V_{out}$ away from its steady state, because the nFET bias current temporarily exceeds the tunneled FG pFET current. To attain equilibrium, the AFGA injects the FG pFET, eventually increasing $V_{out}$ to the steady-state value. Through this process, we can inject the FG FET at a different rate by adjusting the injection and tunneling voltages, and we can inject to a different bias current by changing the nFET gate voltage ($V_{\tau}$). The adaptive and bandpass characteristics of the AFGA are demonstrated simultaneously by providing a sine wave superimposed on a pulse train, ensuring that the sine wave period is shorter than the adaptation time. The AFGA outputs a bandpass transient response to the pulse train while amplifying the sine wave.

Standard cell performance metrics are compared with other work on application-specific ICs (ASICs) and general-purpose FPAAs (Fig. 6). We observe a significant lowering of injection and tunneling voltages at smaller process nodes. Our TAs and FG TAs compare favorably to other approaches [4,5], with the FG TA demonstrating excellent linearity. Our bandpass filter tuning range compares favorably to other work [3-5] and occupies significantly less area. All our standard cells occupy less area than similar circuits in other works, which are primarily in 350nm processes. Our standard cells in both 65nm and 130nm have the same pitch and occupy similar areas, facilitating scalability and a seamless transition between technology nodes. This work is an enabling innovation in analog system design, demonstrating the utility of programmable FG FET standard cells to streamline and economize modern CMOS design.
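As a rough illustration of the area comparison, the sketch below computes area ratios between the 65nm cells and the corresponding blocks of [3], using only the values listed in Fig. 6. The dictionary and function names are hypothetical, and the ratios mix different process nodes (65nm versus 350nm), so they indicate scale rather than a like-for-like design comparison.

```python
# (65nm proposed area, [3] area), both in square micrometers, from Fig. 6.
AREAS_UM2 = {
    "Bandpass Filter": (420.2, 222071.0),
    "2-TA (FG Bias)": (116.2, 3733.0),
    "2-FG TA": (175.4, 5599.0),
    "4x2 VMM (Direct)": (119.11, 1452.0),
    "4-SPST T-gate": (31.4, 1428.0),
}

def area_ratios(areas):
    """Ratio of the reference area to the proposed area for each block."""
    return {name: ref / ours for name, (ours, ref) in areas.items()}

ratios = area_ratios(AREAS_UM2)
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{name}: {ratio:.1f}x smaller")
```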

## Acknowledgement:

We thank Kevin Nielson for help with packaging chips and populating test boards.

## **References:**

- [1] J. Hasler, CICC, 2022.
- [2] A. Ige, JLPEA, 2023.
- [3] S. George, IEEE TVLSI, 2016.
- [4] B. Rumberg et al., IPSN, 2015.
- [5] M. Diab & S. Mahmoud, Microel J, 2020.



Fig. 1. 65nm and 130nm programmable analog standard cell library concept for system synthesis in island architectures. We achieve programmability using FG pFETs. FG FETs enable the designer to reshape FET IV characteristics, allowing mismatch compensation and flexible current biasing. The 4x2 direct VMM crossbar sets the overall standard cell pitch.



Fig. 3. 65nm ML classifier. (a) 3D decision boundaries and time-domain waveforms of a single-layer universal classifier solving the AND, XOR, and NOR Boolean classification problems. (b) Schematics of the VMM and WTA circuits comprising the classifier.





Fig. 2. Programmable 65nm TA circuits. a)  $I_{out}$  of TA and linear FG TA. b) FG TA offset adjustment. c) C<sup>4</sup> bandpass filter. d) Low-pass filter. FG biasing preserves power-filter-time-constant product.



Fig. 4. Characterization of the 65nm arbitrary waveform generator: a) Output of the arbitrary waveform generator with a sine and sawtooth programmed into it (sine wave offset for clarity). b) Currents programmed into the VMM to generate the two waveforms. c) Schematic of the waveform generator.

Comparison of Configurable Analog Blocks (<sup>*</sup>After padding to pitch. <sup>†</sup>Courtesy of the authors.)

| Metric | Parameter | Proposed (65nm) | Proposed (130nm) | [3] TVLSI 2016 | [4] IPSN 2015 | [5] Microel J 2020 |
|---|---|---|---|---|---|---|
| Tech Node (nm)                                                                           |                           | 65                 | 130    | 350                        | 350                    | 90                  |
| Programmable Biasing?                                                                    |                           | Y                  | Y      | Y                          | Y                      | N                   |
| Voltage (V)                                                                              | Supply                    | 1.80               | 1.6    | 2.5                        | 2.5                    | 1.2                 |
|                                                                                          | Injection                 | 3.25               | 4.0    | 6.0                        | 6.0                    |                     |
|                                                                                          | Tunneling                 | 6.25               | 8.0    | 12.0                       | 12.5 <sup>†</sup>      |                     |
| Bandpass<br>Filter<br>Metrics                                                            | TA Input Lin. (mV)        | ±85 - ±850         |        | ±60 - ±670 <sup>†</sup>    | ±80 <sup>†</sup>       |                     |
|                                                                                          | Bandwidth (kHz)           | 1.8-14             |        | 0.1-4                      | 0.02-20 <sup>†</sup>   | 0.133-0.467         |
|                                                                                          | Power (µW)                | 1.1-7.6            |        | 0.02-0.6 (est)             | 0.001-1.1 <sup>†</sup> | 0.005 (sim)         |
| Cell Pitch (µm)                                                                          |                           | 6.5                | 6.5    |                            |                        |                     |
| Area (μm²)                                                                               | Bandpass Filter           | 420.2              | 266.6  | 222,071 <sup>†</sup>       | 133,000 <sup>†</sup>   |                     |
|                                                                                          | 2-TA (FG Bias)            | 116.2 <sup>*</sup> | 116.5  | 3,733 <sup>†</sup>         |                        |                     |
|                                                                                          | 2-FG TA                   | 175.4*             | 182.6  | 5,599 <sup>†</sup>         | -                      |                     |
|                                                                                          | 4x2 VMM (Direct)          | 119.11             | 130.8  | 1,452 <sup>†</sup>         |                        |                     |
|                                                                                          | 4x2 VMM (Indirect)        | 136.8              |        |                            |                        |                     |
|                                                                                          | 4-bit Cap Array           |                    | 238.6  | 1,577 <sup>†</sup> (3-bit) |                        |                     |
|                                                                                          | 4-SPST T-gate             | 31.4               | 30.9   | 1,428 <sup>†</sup>         |                        |                     |
|                                                                                          | 6-bit DAC                 | 6,533              | 6,632  |                            |                        |                     |
|                                                                                          | 4-WTA                     | 89.4*              | 91.5   |                            |                        |                     |
|                                                                                          | AWG                       | 21,036             |        |                            |                        |                     |
|                                                                                          | Scanner (4-Out)           | 177.1              | 304.6  |                            |                        |                     |
|                                                                                          | FG FET Char Cell          | 198.1              | 238.6  |                            |                        |                     |
|                                                                                          | FG Bootstrap              | 119.7              | 194.7* |                            |                        |                     |
|                                                                                          | 2 nFET+2 pFET<br>(W/L=10) | 18.1*              | 22.9   |                            |                        |                     |

Fig. 6. Performance summary and comparison with both application-specific and general-purpose FPAA implementations.