# Hardware Correlator Development at SHAO

Zhijun Xu, Jiangying Gan, Shaoguang Guo

Abstract Hardware correlators have been used in the Chinese Chang'E missions. Recently, a hardware correlator based on the Uniboard has been developed. This article presents the development of hardware correlators at SHAO and some test results.

Keywords Hardware correlator, FPGA, Uniboard

## 1 Introduction



Fig. 1 Chinese VLBI Network.

The Chinese VLBI Network (CVN) has five stations located in Shanghai, Beijing, Kunming, and Urumqi, and a VLBI center in Shanghai that processes data from four stations in real time. A Mark IV hardware correlator was used in the Chang'E-1 project, and a Mark 5B hardware correlator in the Chang'E-2 project, to process the four stations' data in real time, as shown in Figure 2.



Fig. 2 Chang'E-1 and Chang'E-2 hardware correlators.

The Chang'E-3 hardware correlator was used in the Chang'E-3 and Chang'E-5T1 missions and performed well. It includes five FPGA boards with identical hardware, each consisting of one Xilinx Virtex-4 FX60 FPGA and four LX160 FPGAs.



Fig. 3 Chang'E-3 hardware correlator.

Shanghai Astronomical Observatory, Chinese Academy of Sciences

The FX60 FPGA includes a 1 Gigabit Ethernet port and two embedded PowerPC405 processors for sending and receiving processing data and control information to and from the outside network.

The FX60 also connects to the four LX160 FPGAs via a 64-bit bus, over which it exchanges processing data and control information with each LX160 FPGA. Each LX160 FPGA has a 32-bit cPCI bus connection to the motherboard, which connects all five FPGA boards. The FPGA board and the correlator are shown in Figure 3.



## 2 Hardware Correlator in Uniboard

Fig. 4 Previous hardware correlator diagram.

We are currently designing a hardware correlator on the Uniboard as the next-generation correlator. At first we intended to reuse the previous design, shown in Figure 4.

| Module          | Calculation                                                                   | Volume       | Max Vol.    |
|-----------------|-------------------------------------------------------------------------------|--------------|-------------|
| 10GbE Interface | 128MHz*1bits*1station*32channel                                               | 8Gbps/chip   | 30Gbps/chip |
| Quantification  | 128MHz*16bits*1station*32channel                                              | 128Gbps/chip | /           |
| Fringe Stopping | 128MHz*16bits*1station*32channel*2complex                                     | 256Gbps/chip | /           |
| FFT             | 128MHz*16bits*1station*32channel*2complex/2symmetry                           | 128Gbps/chip | /           |
| FSTC            | 128MHz*16bits*1station*32channel*2complex/2symmetry                           | 128Gbps/chip | /           |
| IO Interface    | 128MHz*16bits*4station*8channel*2complex/2symmetry                            | 128Gbps/chip | 75Gbps/chip |
| MAC             | 128MHz*32bits*10baseline*8channel*2complex/2symmetry                          | 640Gbps/chip | /           |
| LTA             | (32bits*32channel*2complex*10baseline*1024fft-size/2symmetry)/1s integration  | 10Mbps/chip  | 1Gbps/chip  |

Fig. 5 Data speed table for previous design.

We soon found, however, that this design has a speed bottleneck. As shown in Figure 5, the IO interface in the previous design sits after the FSTC (fractional sample time correction) module. With an input data rate of 8 Gbps, the data rate after FSTC grows to 128 Gbps, but the IO interface is limited to 75 Gbps. The only way to remove the bottleneck in this design is to reduce the input data rate.
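The bottleneck can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the per-chip figures from Figure 5; the helper name `max_input_rate` is hypothetical, and the sketch assumes that the data rate at the IO interface scales linearly with the input rate.

```python
# Per-chip figures read from the data speed table for the previous design (Gbps).
INPUT_RATE_GBPS = 8.0          # rate entering through the 10GbE interface
RATE_AFTER_FSTC_GBPS = 128.0   # rate the IO interface would have to carry
IO_LIMIT_GBPS = 75.0           # maximum rate of the IO interface


def max_input_rate(input_rate, rate_at_io, io_limit):
    """Return the largest input rate whose expanded stream fits the IO limit.

    Assumption: the rate at the IO interface scales linearly with the input rate.
    """
    expansion = rate_at_io / input_rate
    return io_limit / expansion


expansion = RATE_AFTER_FSTC_GBPS / INPUT_RATE_GBPS
allowed = max_input_rate(INPUT_RATE_GBPS, RATE_AFTER_FSTC_GBPS, IO_LIMIT_GBPS)

print(f"expansion factor before the IO interface: {expansion:.0f}x")
print(f"IO interface overloaded: {RATE_AFTER_FSTC_GBPS > IO_LIMIT_GBPS}")
print(f"largest input rate that fits: {allowed:.4f} Gbps")
```

Under this linear assumption, the input rate would have to drop to well under 5 Gbps per chip for the previous layout to fit within the IO interface limit.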



Fig. 6 Uniboard hardware correlator diagram.

As shown in Figure 6, the current Uniboard hardware correlator design moves the IO interface to directly after the data playback.
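The processing stages that follow the IO interface (fringe stopping, FFT, FSTC, MAC, and LTA, as named in the data speed tables) can be illustrated in software. The sketch below is a simplified NumPy model of such an FX chain for a single baseline, not the FPGA implementation; the function name `fx_correlate` and its parameters `fringe_rate` and `frac_delay` are hypothetical, and quantization and delay tracking are omitted.

```python
import numpy as np


def fx_correlate(x1, x2, fft_size, fringe_rate=0.0, frac_delay=0.0):
    """Correlate two sample streams with a minimal FX chain (one baseline).

    fringe_rate: model phase rate of stream 2 in radians per sample (assumed input).
    frac_delay:  residual delay of stream 2 in samples, corrected after the FFT.
    """
    n_frames = min(len(x1), len(x2)) // fft_size
    n = n_frames * fft_size
    t = np.arange(n)

    # Fringe stopping: counter-rotate stream 2 by the model phase (data become complex).
    s1 = np.asarray(x1[:n], dtype=np.complex128)
    s2 = np.asarray(x2[:n], dtype=np.complex128) * np.exp(-1j * fringe_rate * t)

    # FFT: transform frame by frame and keep half of the bins ("/2 symmetry").
    half = fft_size // 2
    f1 = np.fft.fft(s1.reshape(n_frames, fft_size), axis=1)[:, :half]
    f2 = np.fft.fft(s2.reshape(n_frames, fft_size), axis=1)[:, :half]

    # FSTC: undo the fractional-sample delay with a linear phase slope across bins.
    k = np.arange(half)
    f2 = f2 * np.exp(2j * np.pi * k * frac_delay / fft_size)

    # MAC: multiply stream 1 by the conjugate of stream 2, bin by bin.
    products = f1 * np.conj(f2)

    # LTA: accumulate the products over all frames.
    return products.sum(axis=0)


rng = np.random.default_rng(0)
x = rng.standard_normal(8 * 1024)
vis = fx_correlate(x, x, fft_size=1024)
print(vis.shape, float(np.abs(vis.imag).max()))
```

Correlating a stream with itself, as in the usage lines, should give a purely real spectrum, which is a convenient sanity check before feeding in two different stations.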

| Module          | Calculation                                                                   | Volume       | Max Vol.    |
|-----------------|-------------------------------------------------------------------------------|--------------|-------------|
| 10GbE Interface | 256MHz*1bits*1station*32channel                                               | 8Gbps/chip   | 30Gbps/chip |
| IO Interface    | 256MHz*1bits*4station*8channel                                                | 8Gbps/chip   | 75Gbps/chip |
| Quantification  | 256MHz*16bits*4station*8channel                                               | 128Gbps/chip | /           |
| Fringe Stopping | 256MHz*16bits*4station*8channel*2complex                                      | 256Gbps/chip | /           |
| FFT             | 256MHz*16bits*4station*8channel*2complex/2symmetry                            | 128Gbps/chip | /           |
| FSTC            | 256MHz*16bits*4station*8channel*2complex/2symmetry                            | 128Gbps/chip | /           |
| MAC             | 256MHz*32bits*10baseline*8channel*2complex/2symmetry                          | 640Gbps/chip | /           |
| LTA             | (32bits*32channel*2complex*10baseline*1024fft-size/2symmetry)/1s integration  | 10Mbps/chip  | 1Gbps/chip  |

Fig. 7 Data speed table for Uniboard design.

As shown in Figure 7, the data rate after the 10GbE interface and data playback is still 8 Gbps, so the IO interface is no longer a bottleneck. The subsequent processing stages, which expand the data volume, all run inside the FPGA chip, so the system as a whole has no speed bottleneck.
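The volumes in Figure 7 can be reproduced directly from the formulas in the table. The sketch below is only a cross-check under two assumptions that the text does not state: the Gbps and Mbps figures use binary prefixes (1 Gbps = 1024 Mbps), and the 10 baselines for four stations include the four autocorrelations. The helper name `stations_to_baselines` is hypothetical.

```python
MBPS_PER_GBPS = 1024  # assumption: binary prefixes, needed to match the table values
CLOCK_MHZ = 256
IO_LIMIT_GBPS = 75


def stations_to_baselines(n_stations):
    """Baseline count including autocorrelations (assumed; gives 10 for 4 stations)."""
    return n_stations * (n_stations + 1) // 2


baselines = stations_to_baselines(4)

# Each entry follows the formula in the corresponding table row (Mbps per chip).
rates_mbps = {
    "10GbE Interface": CLOCK_MHZ * 1 * 1 * 32,
    "IO Interface": CLOCK_MHZ * 1 * 4 * 8,
    "Quantification": CLOCK_MHZ * 16 * 4 * 8,
    "Fringe Stopping": CLOCK_MHZ * 16 * 4 * 8 * 2,
    "FFT": CLOCK_MHZ * 16 * 4 * 8 * 2 // 2,
    "FSTC": CLOCK_MHZ * 16 * 4 * 8 * 2 // 2,
    "MAC": CLOCK_MHZ * 32 * baselines * 8 * 2 // 2,
}
rates_gbps = {name: mbps / MBPS_PER_GBPS for name, mbps in rates_mbps.items()}

# LTA output: bits produced per 1 s integration, expressed in Mbps.
lta_bits_per_s = 32 * 32 * 2 * baselines * 1024 // 2
lta_mbps = lta_bits_per_s / (1024 * 1024)

for name, gbps in rates_gbps.items():
    print(f"{name:16s} {gbps:6.0f} Gbps")
print(f"{'LTA':16s} {lta_mbps:6.0f} Mbps")
print("IO interface within limit:", rates_gbps["IO Interface"] <= IO_LIMIT_GBPS)
```

Only the first two rows and the LTA output cross a chip boundary; the larger intermediate rates stay inside the FPGA, which is why they carry no maximum in the table.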

## 3 Some Results

Implementation of the previous hardware correlator functions on the Uniboard has been completed, and we have tested it on one source sampled by two CDAS2s, as shown in Figure 8. We also processed data from previous Chang'E missions and compared the results with those of the Chang'E-3 hardware correlator. We are currently designing the new functions for the next mission.



Fig. 8 Test of the same source, sampled by two CDAS2s (16 channels, 2 bits, 512 MHz bandwidth).



Fig. 9 Test of previous Chang'E data.

## 4 Future Plans

- Deep Space Correlator will:
  - Be expanded to an eight-station, 32-channel system;
  - Support the VDIF VSR format;
  - Support multi-target parallel processing;
  - Support 4-, 8-, and 16-bit sampled data.
- VGOS Correlator will:
  - Be expanded to a 16-station, 32-channel system;
  - Support a real-time mode;
  - Support the SWIN output format;
  - Be updated to a Xilinx board or Uniboard2;
  - Support VGOS mode with a maximum of 16 Gbps per station.
