## FAULT TOLERANT DIGITAL CONTROLLER FOR DC-DC SWITCHING POWER CONVERTER USING MODULAR REDUNDANCY

Daniel Dilbie<sup>1,\*</sup>, Getachew Alemu<sup>1</sup>

<sup>1</sup>School of Electrical and Computer Engineering, Addis Ababa Institute of Technology (AAiT), Addis Ababa University (AAU)

\*Corresponding author. E-mail address: daniel@aait.edu.et

DOI: https://doi.org/10.20372/zede.v41i.8627

#### ABSTRACT

Power converters and regulators are critical building blocks of all electronic systems. In applications prone to transient faults, such as particle strikes, spatial redundancy techniques can improve reliability significantly. This paper presents the design of a fully digital DC/DC switching buck converter regulator based on a fault-tolerant modular redundancy architecture implemented on SRAM FPGAs. Transient events such as single event functional interrupts (SEFIs) are the dominant effects in SRAM-based FPGAs. SEFIs result in missing pulses in the generated PWM control signal of the converter, which cause large transient drops at the converter output. In this work, the triple modular redundancy (TMR) technique is used to implement spatial redundancy. The physical digital blocks on the FPGA are triplicated so that faults in one of the modules can be detected and corrected while the system continues to operate uninterrupted and with correct output. Experimental results indicate that, with triple modular redundancy, the power converter can withstand up to a 5x higher dose of faults than conventional power converters.

**Keywords**: single event functional interrupt (SEFI), fault tolerant design, spatial redundancy, DC/DC switching power converter, buck converter, digital control

### 1. INTRODUCTION

DC/DC switching power converters, due to their high-efficiency conversion, are essential parts of modern electronic systems [1, 2]. The main objective of this work is to design a fault-tolerant FPGA-based digital control system for a DC-DC buck converter/regulator. Digital control systems have significant advantages over conventional analog pulse width modulators (PWMs) [3, 4].

Along with the general advantages of a digital system (such as high flexibility, reduced sensitivity to noise and component parameter variations, and the capability to realize sophisticated control algorithms), a digital controller can be hardened more easily against transient faults than its analog counterpart. In fact, a conventional PWM switching converter is very susceptible to single-event effects (SEEs) in the error amplifier stage and in the analog PWM controller, which cause large transient pulses at its output [5-7]. The use of commercial SRAM-based FPGAs in applications such as solar power converters in satellites is very attractive because of their high component density, quick turn-around time, and reconfigurability [7, 9].

Normally, SRAM-based FPGAs are very sensitive to SEEs. The dominant effect is the single event functional interrupt (SEFI) caused by a configuration memory bit upset that disrupts the continuous operation of the system in which the FPGA is used. Heavy ion radiation testing on these devices has shown that no permanent faults occur after an SEFI, and that reprogramming the FPGA will restore full functionality [10, 11]. Therefore, "radiation hardening by design" (RHBD) techniques based on detecting, mitigating and correcting SEFIs make it possible to use SRAM-based FPGAs in high radiation environments [12].

The paper is organized as follows. Section 2 describes the design of a digitally controlled switching buck converter. Section 3 discusses the main radiation effects on SRAM FPGAs and illustrates the technique applied to harden the design against radiation-induced errors. Section 4 presents the simulated and experimental results, and Section 5 concludes and summarizes the work.

### 2. DIGITALLY CONTROLLED BUCK CONVERTER

A DC/DC switching buck converter steps down an unregulated input voltage, such as from a solar panel, to a regulated output voltage for a wide range of input voltages and load conditions. The circuit uses an inductor, a power MOSFET to transfer power from the input to the output in an efficient way, a capacitor to smooth out the ripple in the output, and a freewheeling diode (or a MOSFET synchronous rectifier for highest efficiency). The MOSFET is periodically switched from the on state to the off state and vice versa. The ratio between the on-state time interval and the switching period (duty cycle) determines the DC value of the output voltage. To maintain the required output voltage in the presence of variations in the input voltage and load conditions, a feedback network, which controls the duty cycle, is used [1].
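As a rough guide, assuming an ideal lossless buck converter operating in continuous conduction (an idealization that the text does not state explicitly), the relation between the duty cycle and the DC output voltage can be written as follows. The symbols $D$, $V_{in}$, $t_{on}$ and $T_{sw}$ are introduced here for illustration only:

$$V_o = D \, V_{in}, \qquad D = \frac{t_{on}}{T_{sw}}$$

Under this assumption, a 5V output would need $D \approx 5/12 \approx 0.42$ at a 12V input and $D \approx 5/24 \approx 0.21$ at a 24V input. A real converter needs a somewhat larger duty cycle to cover conduction and switching losses, which is why the feedback loop has to adjust the duty cycle with load.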

Figure 1 shows the basic structure of a digitally controlled DC/DC switching buck converter in which the feedback loop is implemented using an analog-to-digital converter (ADC). The actual output voltage Vo is scaled by the sensor gain, sampled, and converted into a digital signal by the ADC. The difference between the digitized sample of the converter output and the reference voltage (Ref) forms an error signal e that is processed by the digital compensator to calculate the duty cycle c. The digital pulse width modulator (PWM) generates the switching signals (H and L) that control the two power MOSFETs.



Figure 1 Basic schematic of a digitally controlled buck converter.

In this work, the triple modular redundancy (TMR) technique is used to implement spatial redundancy, as shown in Figure 2. The physical digital blocks on the FPGA are triplicated so that faults in one of the modules can be detected and corrected while the system continues to operate uninterrupted and with correct output. The classic three-voter configuration provides the best reliability compared to all other configurations [13]. Its drawback is that it takes the largest space in hardware implementations. However, for small circuits, such as this particular application, space is not a constraint. Therefore, it is justified to choose the classic three-voter configuration to achieve the best reliability [14]. The reference (Ref) is continuously compared with the feedback signal (FB). The error signal from the comparator is passed to the proportional and integral (P) compensator. The compensator's output is then used to set the duty cycle of the pulse width modulator (PW). The majority voter logic (V) selects the majority value of the three parallel outputs at each stage, and the PWM check logic ensures the integrity of the pulse output.
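The majority voting described above can be sketched in software as a bitwise 2-of-3 function. This is a minimal illustration rather than the FPGA implementation used in this work; the function names and the 10-bit duty-cycle word are assumptions made for the example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three replica outputs."""
    return (a & b) | (b & c) | (a & c)


def disagreeing_replicas(a: int, b: int, c: int) -> list[int]:
    """Indices of the replicas whose output differs from the voted value."""
    voted = majority_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


# Example: replica 1 suffers a single bit flip in a hypothetical 10-bit word.
good = 0b0110101001
faulty = good ^ 0b0000100000
print(majority_vote(good, faulty, good) == good)    # True
print(disagreeing_replicas(good, faulty, good))     # [1]
```

Because each output bit is decided independently, a single corrupted replica is always outvoted, and the index returned by `disagreeing_replicas` indicates which module would need to be corrected.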



Figure 2 Block diagram of the designed TMR controller.

### 3. SINGLE EVENT EFFECTS ON SRAM FPGAS AND FAULT TOLERANT DESIGN

Heavy ion testing on SRAM FPGAs from different semiconductor manufacturers has shown that these parts are very sensitive to radiation-induced errors [9-11]. The dominant error is the single event functional interrupt, which is caused by a configuration memory bit flip. It is worth noting that a user flip-flop is as susceptible as a configuration memory cell to radiation-induced upset.

However, the number of configuration bits is much greater than the number of user flip-flops, and thus the probability of an SEFI is significantly greater than the probability of a single event upset (SEU) associated with a user register. Moreover, no permanent faults, such as single event latch-up or dielectric rupture, have been observed during heavy ion tests. In addition, reprogramming the FPGA restores the full functionality of the device after the occurrence of an SEFI. Therefore, RHBD techniques based on detecting, mitigating and correcting SEFIs make it possible to use SRAM FPGAs in high radiation environments [12, 13].

When an SRAM-based FPGA is used to generate the digital control signal of a switching converter, radiation-induced SEFIs result in missing pulses in the generated PWM control signal. In turn, the missing pulses result in large transient voltage drops at the output of the converter that may adversely affect the operation of the powered systems [5, 6]. Therefore, an RHBD technique must be applied to mitigate and correct the SEFIs. In the work described here, a redundant approach at both the logic design and the device levels has been applied.

### 4. EXPERIMENTAL RESULTS

To validate this approach, a TMR digital circuit was implemented on a Xilinx Zynq-7000 SoC. This SoC integrates a Xilinx Artix-7 FPGA and a dual-core ARM Cortex-A9 MPU on a single chip. The TMR architecture in Figure 2 is implemented on the FPGA, while the Cortex-A9 is used to simulate single event effects on the FPGA configuration memory. Table 1 summarizes the FPGA resources used for the conventional controller design and for the TMR implementation with triplicated voters.

According to data from the Xilinx datasheet [15], the cost per logic cell (LC) on the Artix-7 series FPGA is roughly linear at USD 0.0018. The hardware cost of implementing a standard controller on the FPGA is insignificant, at only 0.06% of the available LCs. The TMR version, even though it requires about 3x more resources, still uses less than 0.19% of the FPGA's LCs while providing 5x more reliability than the standard controller.

**Table 1** Comparison between standard and TMR designs in Artix-7 XC7A200T

| Digital DC-DC controller design on SRAM FPGA | Area (# slices) | # Logic cells (LC) | # CLB flip-flops |
|----------------------------------------------|-----------------|--------------------|------------------|
| Total FPGA resources                         | 33,650          | 215,360            | 269,200          |
| Standard: no TMR used                        | 9               | 128                | 12               |
| Percentage used                              | 0.027%          | 0.059%             | 0.004%           |
| Full TMR implementation                      | 21              | 402                | 41               |
| Percentage used                              | 0.062%          | 0.187%             | 0.015%           |
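The utilization percentages in Table 1 and the logic-cell cost implied by the USD 0.0018 per LC figure can be reproduced with a few lines of arithmetic. The sketch below only restates numbers from Table 1; the dictionary and function names are illustrative, and the cost is a rough estimate under the linear-cost assumption quoted above.

```python
# Resource counts taken from Table 1 (Artix-7 XC7A200T).
TOTAL = {"slices": 33_650, "logic_cells": 215_360, "flip_flops": 269_200}
STANDARD = {"slices": 9, "logic_cells": 128, "flip_flops": 12}
FULL_TMR = {"slices": 21, "logic_cells": 402, "flip_flops": 41}
COST_PER_LC_USD = 0.0018  # roughly linear cost per logic cell, per [15]


def utilization_percent(used: dict, total: dict) -> dict:
    """Percentage of each FPGA resource consumed by a design."""
    return {name: 100.0 * used[name] / total[name] for name in used}


def logic_cell_cost_usd(used: dict) -> float:
    """Estimated cost of the logic cells occupied by a design."""
    return used["logic_cells"] * COST_PER_LC_USD


for label, design in (("standard", STANDARD), ("full TMR", FULL_TMR)):
    percent = utilization_percent(design, TOTAL)
    summary = ", ".join(f"{k}: {v:.3f}%" for k, v in percent.items())
    print(f"{label}: {summary}, LC cost ~ USD {logic_cell_cost_usd(design):.2f}")

# Logic-cell overhead of TMR relative to the standard design.
print(f"LC overhead: {FULL_TMR['logic_cells'] / STANDARD['logic_cells']:.2f}x")
```

The logic-cell overhead comes out slightly above 3x because the voters add logic on top of the three replicas.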

The implementation uses a typical voltage-mode control loop designed to deliver a regulated 5V output from an unregulated input of 12 - 24V DC (a typical solar panel voltage range). First, the output voltage is sensed, scaled down and fed to an analog-to-digital converter, which feeds the digital equivalent of the output voltage back to the control loop. Then, the feedback is subtracted from a fixed reference to obtain the error signal. The error signal is passed through the PID compensator to obtain the control value, which determines the pulse width of the next control cycle. The difference equation (1) was implemented in a control loop with a bandwidth of 50 kHz.

 $U(k) = U(k-1) + K_P \times \mathrm{error}(k) + K_I \times \mathrm{integral} + K_D \times \mathrm{derivative}$ (1)

where $U(k)$ is the current control value, $U(k-1)$ is the previous control value, $\mathrm{error}(k)$ is the current error, $\mathrm{error}(k-1)$ is the previous error, $K_P$ is the proportional constant, $K_I$ is the integral constant, and $K_D$ is the derivative constant.

The process is broken down into three modules: the comparator module, the PID module and the PWM generator module. The three modules are triplicated so that their outputs can be computed in parallel and compared with each other. The error, the control value and the pulse width are the three values voted on in TMR.
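The stage-by-stage voting on the error, the control value and the pulse width can be sketched as follows. This is an illustrative software model, not the authors' HDL: the compensator is a textbook incremental (velocity-form) PID, which may differ from the exact form of equation (1), and the integer gains, the 1000-count PWM resolution, the reference value of 512, the bit-flip fault model and the write-back of voted values to all replicas are all assumptions made for the example.

```python
from dataclasses import dataclass

PERIOD_COUNTS = 1000  # assumed PWM resolution: 1000 counts per period


def vote(a: int, b: int, c: int) -> int:
    """Word-level 2-of-3 majority vote."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    return a  # all three differ: not correctable in this sketch


@dataclass
class Replica:
    """One copy of the comparator, compensator and PWM modules."""
    u_prev: int = 0   # U(k-1)
    e_prev: int = 0   # error(k-1)
    e_prev2: int = 0  # error(k-2), needed only by the assumed velocity form

    def compare(self, ref: int, fb: int) -> int:
        return ref - fb

    def compensate(self, e: int, kp: int, ki: int, kd: int) -> int:
        return (self.u_prev + kp * (e - self.e_prev) + ki * e
                + kd * (e - 2 * self.e_prev + self.e_prev2))

    def pulse_width(self, u: int) -> int:
        return max(0, min(PERIOD_COUNTS, u))  # clamp to one PWM period

    def commit(self, e: int, u: int) -> None:
        self.e_prev2, self.e_prev, self.u_prev = self.e_prev, e, u


def tmr_step(replicas, ref, fb, gains, faulty=None):
    """One control cycle with a vote after each of the three stages."""
    errors = [r.compare(ref, fb) for r in replicas]
    if faulty is not None:
        errors[faulty] ^= 0x40  # emulate an upset in one replica's output
    e = vote(*errors)
    u = vote(*(r.compensate(e, *gains) for r in replicas))
    width = vote(*(r.pulse_width(u) for r in replicas))
    for r in replicas:
        r.commit(e, u)  # voted values are written back to every replica
    return width


def run(feedback, faulty=None):
    replicas = [Replica() for _ in range(3)]
    return [tmr_step(replicas, 512, fb, (2, 1, 0), faulty) for fb in feedback]


samples = [400, 450, 500, 512]  # hypothetical ADC readings
print(run(samples) == run(samples, faulty=1))  # True
```

Writing the voted values back to all three replicas keeps their internal state aligned, so an upset in one cycle does not accumulate in the corrupted replica's history.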

At steady state, with no fault, the duty cycle of the output square wave is constant for a fixed load and constant input. In this particular case, with a constant 12V input, the duty cycle is 42% at no load, 53% at half load (2.5W) and 65% at full load (5W).

#### 4.1 Fault Impact Analysis

Fault impact analysis is done by observing the duty cycle of the output pulse from the PWM generator at fixed load conditions. For this test, the load and input voltages are kept constant so that only the impact of transient faults on the output is observed. The period of the output square wave is kept constant at 1ms (1 kHz PWM frequency) and the duty cycle is varied by the controller to keep the output voltage regulated at  $5V \pm 1\%$ . A momentary fault in the controller results in a wrong duty cycle, which may cause a temporary output overshoot or undershoot.
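To make the link between the on-time, the duty cycle and a missing pulse concrete, the following sketch models a counter-compare PWM generator over one 1ms period. The one-count-per-microsecond resolution and the function names are assumptions for illustration; the resolution of the actual PWM module is not stated here.

```python
PWM_PERIOD_US = 1000       # 1 ms period (1 kHz), as stated in the text
COUNTS_PER_PERIOD = 1000   # assumed resolution: one count per microsecond


def pwm_period(compare: int) -> list[int]:
    """One PWM period: the output is 1 while the counter is below `compare`."""
    return [1 if count < compare else 0 for count in range(COUNTS_PER_PERIOD)]


def duty_percent(samples: list[int]) -> float:
    """Duty cycle of one period, in percent."""
    return 100.0 * sum(samples) / len(samples)


def on_time_us(samples: list[int]) -> float:
    """On-time of one period, in microseconds."""
    return sum(samples) * PWM_PERIOD_US / COUNTS_PER_PERIOD


nominal = pwm_period(420)  # compare value for a 42% duty cycle
missing = pwm_period(0)    # a missing pulse: the output never goes high

print(duty_percent(nominal), on_time_us(nominal))  # 42.0 420.0
print(duty_percent(missing))                       # 0.0
```

In this model a missing pulse is simply a period with zero on-time, which is why even a single faulty period pulls the measured minimum duty cycle far below its steady-state value.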

#### 4.2 Baseline Case

This case is used as a reference. In this case, the system runs without TMR and with no fault injected. The test was run for 10,000 PWM cycles (10 seconds) at steady state and the following result was obtained (see Table 2).

#### 4.3 Result at Injection Rate of 1/3000 (one in 3000 cycles)

As in the previous case, the test was run for 10,000 pulse cycles (10 seconds) at steady state. As shown in Table 3, both overshoot and undershoot conditions were observed in the data (even though they are too fast to observe on the scope). Although the output regulation is not affected much, the transients could damage the power switches at higher load currents if they occur frequently.
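For orientation, the number of faults injected during one 10,000-cycle run follows directly from the injection rate. The sketch below assumes evenly spaced injections, which is an assumption made for illustration; the actual timing of the injected faults is not specified here, and the function names are hypothetical.

```python
from fractions import Fraction

TEST_CYCLES = 10_000  # PWM cycles per test run (10 s at 1 kHz)


def expected_faults(rate: Fraction, cycles: int = TEST_CYCLES) -> float:
    """Average number of faults injected during one test run."""
    return float(rate * cycles)


def injection_schedule(rate: Fraction, cycles: int = TEST_CYCLES) -> list[int]:
    """Cycle indices of evenly spaced injections (assumed schedule)."""
    interval = rate.denominator // rate.numerator
    return list(range(interval, cycles + 1, interval))


for rate in (Fraction(1, 5000), Fraction(1, 3000), Fraction(1, 1000), Fraction(1, 100)):
    print(rate, round(expected_faults(rate), 1), len(injection_schedule(rate)))
```

At a rate of 1/3000 only about three faulty cycles occur in a whole run, which is consistent with the extreme minimum and maximum values changing sharply while the standard deviation changes only modestly.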

**Table 2** Baseline case

| Parameters                    | No load       | 2.5W load     | 5W load       |
|-------------------------------|---------------|---------------|---------------|
| Maximum duty cycle            | 438µs (43.8%) | 542µs (54.2%) | 670µs (67%)   |
| Minimum duty cycle            | 412µs (41.2%) | 510µs (51%)   | 645µs (64.5%) |
| Duty cycle standard deviation | 11µs          | 13µs          | 19µs          |
| % deviation                   | 2.6%          | 2.7%          | 2.9%          |
| Maximum output voltage        | 5.06V         | 5.07V         | 5.08V         |
| Minimum output voltage        | 4.98V         | 4.97V         | 4.95V         |
| Output standard deviation     | 0.04V         | 0.04V         | 0.05V         |
| % deviation                   | 0.8%          | 0.82%         | 1%            |

**Table 3** Output parameters at a fault rate of 1/3000

| Parameters                    | No load     | 2.5W load     | 5W load       |
|-------------------------------|-------------|---------------|---------------|
| Maximum duty cycle            | 820µs (82%) | 861µs (86.1%) | 924µs (92.4%) |
| Minimum duty cycle            | 12µs (1.2%) | 28µs (2.8%)   | 46µs (4.6%)   |
| Duty cycle standard deviation | 21µs        | 26µs          | 31µs          |
| % deviation                   | 5%          | 4.9%          | 4.8%          |
| Maximum output voltage        | 6.88V       | 5.87V         | 5.98V         |
| Minimum output voltage        | 3.78V       | 3.86V         | 3.21V         |
| Output standard deviation     | 0.09V       | 0.05V         | 0.07V         |
| % deviation                   | 1.8%        | 1.2%          | 1.4%          |

#### 4.4 Higher Fault Rates

The results shown in Table 4 were obtained at different fault injection rates for the 2.5W and 5W loads. At injection rates higher than 1/1000 (more than one fault per 1000 cycles), the controller loses stability and the output voltage starts to oscillate in both cases.

**Table 4** Output parameters at higher fault rates (2.5W/5W loads)

| Fault injection rate    | Duty cycle deviation (%)    | Output voltage deviation (%)    |
|-------------------------|-----------------------------|---------------------------------|
| 1/5000                  | 2.8/3.1                     | 0.86/1.0                        |
| 1/2000                  | 8.3/11.2                    | 2.1/2.8                         |
| 1/1000                  | 21/28.4                     | 8.1/8.7                         |
| 1/500                   | 40.2/48.9                   | 18.4/21.5                       |

#### 4.5 Results with Proposed TMR Architecture

With TMR, the tests were run again under the same conditions. For the case with no faults injected, the result was identical to the baseline case.

#### 4.6 Result at Injection Rate of 1/3000

No overshoot or undershoot conditions were observed, and the result was essentially the same as in the no-fault case (see Table 5).

#### 4.7 Result at Higher Fault Rates

The following results (Table 6) were obtained at different fault injection rates for the 2.5W and 5W loads. The controller is observed to be fairly stable at injection rates of up to 1/100. For both load cases, overshoot and undershoot conditions are observed starting from an injection rate of 1/200.

**Table 5** Output parameters with TMR and fault rate of 1/3000

| Parameters                    | No load       | 2.5W load     | 5W load       |
|-------------------------------|---------------|---------------|---------------|
| Maximum output voltage        | 5.07V         | 5.07V         | 5.08V         |
| Minimum output voltage        | 4.98V         | 4.96V         | 4.93V         |
| Output standard deviation     | 0.044V        | 0.046V        | 0.052V        |
| % deviation                   | 0.88%         | 0.9%          | 1.04%         |
| Maximum duty cycle            | 441µs (44.1%) | 541µs (54.1%) | 673µs (67.3%) |
| Minimum duty cycle            | 410µs (41%)   | 528µs (52.8%) | 642µs (64.2%) |
| Duty cycle standard deviation | 11.3µs        | 15µs          | 20.2µs        |
| % deviation                   | 2.69%         | 2.77%         | 3.1%          |

**Table 6** Output parameters with TMR for higher fault rates (2.5W/5W loads)

| Fault injection rate    | Duty cycle deviation (%)    | Output voltage deviation (%)       |
|-------------------------|-----------------------------|------------------------------------|
| 1/2000                  | 2.75/3.2                    | 0.98/1.05                          |
| 1/1000                  | 3.8/4.3                     | 1.02/1.18                          |
| 1/500                   | 6.1/6.8                     | 1.41/1.82                          |
| 1/200                   | 12/12.3                     | 2.1/2.8                            |
| 1/100                   | 20.9/21.2                   | 5.77/6.1                           |

### 5. CONCLUSION

Power converters and regulators are the main and critical building blocks of all electronic systems. In applications prone to transient faults such as particle strikes, spatial and time redundancy techniques can improve the reliability significantly.

Single event functional interrupts are the dominant radiation effects in SRAM-based FPGAs. This work demonstrates the design of an SEFI-resistant DC/DC switching power converter based on a reconfigurable digital control loop implemented in SRAM FPGAs.

In this project, the results indicate that, with classic triple modular redundancy, the power converter can withstand up to a 5x higher dose of faults than conventional power converters. The roughly 3x additional hardware requirement is justified because the implementation uses less than 0.2% of the total resources. For such a small circuit, the benefit of 5x more reliability far outweighs the additional hardware cost.

### CONFLICT OF INTEREST

There is no conflict of interest in this work.

### ACKNOWLEDGEMENTS

The authors are grateful to Addis Ababa University, Addis Ababa Institute of Technology, School of Electrical and Computer Engineering for the support provided.

### REFERENCES

- [1] Cuk, S.R. and Middlebrook, D., "Basics of Switched-Mode Power Conversion: Topologies, Magnetics, and Control", in Advances in Switching Mode Power Conversion, Pasadena, CA: TESLAco, vol. 2, 1981.
- [2] Capel, A., O'Sullivan, D. and Marpinard, J.C., "High-Power Conditioning for Space Applications", Proc. of the IEEE, vol. 76, no. 4, 1998, pp. 391-408.
- [3] Martin, T.W. and Ang, S.S., "Digital Control for Switching Converters", Proc. of IEEE Int. Symp. on Industrial Electronics, ISIE'95, vol. 2, 1995, pp. 480-484.
- [4] Dancy, A.P., Amirtharajah, R. and Chandrakasan, A.P., "High-efficiency Multiple-output DC-DC Conversion for Low-voltage Systems", IEEE Trans. on VLSI Systems, vol. 8, no. 3, 2000, pp. 252-263.
- [5] Ribeiro, E., Marques Cardoso, A.J. and Boccaletti, C., "Fault Tolerant Strategy for a Photovoltaic DC-DC Converter", IEEE Trans. on Power Electronics, July 2013.
- [6] Khanke, D., "Design and Implementation of High Reliability PWM Modulator Using TMR with Spare Arrangement", IJSR, vol. 3, no. 7, 2014, pp. 1595-1597.
- [7] Mamatha, S., Shubha Rao, K. and Chakravarthi, V.S., "FPGA Based Digital Controller for DC-DC Buck Converter", IJIRCCE, vol. 3, 2015, pp. 58-60.
- [8] Miro, M., Mitja, T. and Primoz, S., "FPGA Implementation of Digital Controller for DC-DC Buck Converter", IEEE IDEAS, vol. 5, 2005, pp. 161-164.
- [9] Adam, M.J., "Reconfigurable Fault Tolerance for Space Systems", PhD dissertation, University of Florida, 2013, pp. 81-84.
- [10] Heijmen, T., "Radiation Effects and Soft Errors in IC and Electronics Devices", World Scientific Ltd, 2004, pp. 250-308.
- [11] Jing, N., Lee, J. and Zhe, F., "SEU Fault Evaluation and Characteristics for SRAM Based FPGA Architectures and Synthesis Algorithms", ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 1, 2012, pp. 142-151.
- [12] Lee, S. and Jae-il, J., "Voting Structure for TMR Modules", Electronics Express, vol. 4, no. 21, 2014, pp. 94-96.
- [13] Siegle, F., Tanya, V., Ilstad, J. and Emam, O., "Mitigation of Radiation Effects in SRAM Based FPGAs for Space Applications", ACM Inc., vol. 1, 2015, pp. 16-19.
- [14] Hamamatsu, M. and Tsuchiya, T., "On the Reliability of Cascaded TMR Systems", The 2010 Pacific Rim International Symposium on Dependable Computing, vol. 2, pp. 184-190.
- [15] https://www.xilinx.com/products/silicon-devices/fpga/artix-7.html (accessed on July 23, 2022).