0) Goals to achieve

The main items to take into consideration are:

a) definition of a new system to be used as a backend, performing all the signal processing functions between the receiver outputs and data transmission or recording;
b) digital implementation of the entire set of functions in the point above, which today are performed partly in the digital and partly in the analog domain;
c) a significant improvement in reliability and technical performance, in order to obtain better data quality;
d) cost reduction with respect to the traditional terminal, to allow broad usage;
e) flexibility in terms of functionality and upgrade possibilities.

An important initial consideration is that, in principle, the backend section of today's VLBI chain could range from a very simple to a very complex system. This becomes evident if we consider the two extremes of the process:

1) sampling and digitizing the IF without any further operation, neither down conversion nor filtering, and then sending the data stream to the correlator through a standard interface; wide-band correlation and selection of the needed information could then do the job, provided a suitably modified correlation processor is available;
2) producing the down-converted narrow bands as they are used now, possibly with additional capabilities.

Our analysis suggests maintaining as much continuity as possible with the present equipment, so we will consider the most complete solution, keeping any possible simplification in mind as an alternative. Of course, the more complete and flexible the solution, the higher the cost and development time.

The need to define a new backend system has been widely discussed in the last few years, and it has been clearly stressed that the increasing need for bandwidth is one of the main drivers, together with the opportunity to widen the observed frequency band. Some assumptions can therefore be made in order to outline a system that covers the likely needs of the near future.

The frequency bands taken into consideration should in any case be limited to a few tens of GHz, to be processed in slots. Present technology gives us an indication of how much data we can expect to process, but a reasonable approach is to make provision for a scalable system able to process multiple blocks of the frequency range, each with a standard bandwidth that is as wide as possible. A reasonable value for this bandwidth seems to be a few GHz. This limit is motivated by cost: the more commercial parts are used, the greater the saving.
Fig. 1 shows, as an example, how such a system would appear. Two or more receiver bands feed the backend system through multiple IF channels, each with a bandwidth of a few GHz, so that even when processing the entire band we would deal with about 10 channels for each polarization.

Fig. 1

The signal should be represented digitally as early as possible in the processing chain, mainly for reasons of reproducibility and reliability. For the same reasons, present technology offers a great number of relatively inexpensive solutions in the digital domain. So this 'broadband' multi-channel backend system gains a new attribute: 'digital'. With this approach the operations to perform become mathematical ones, or more commonly DSP. In principle the processing unit is asked to execute a number of fast calculations, so that, if it were possible, a CPU-based system could be adopted, since developing a software product is naturally the easier and more flexible solution. Today, because of the required data rate, such a solution is not feasible, but it could become achievable in the future as CPU speed and processing power evolve.

So the best we can afford at the moment is a compromise between a fully software solution running on a standard hardware platform and a fully hardware solution. We are talking about programmable logic, implemented in the most generic and flexible way we can plan at this moment. Such an approach allows us to rely on software-like development, while leaving fewer generic steps to the hardware. Moreover, the hardware implementation needs to be generic too, because of the evolution and cost of programmable hardware components. The system architecture therefore needs to be defined with standard external interfaces, so as to simplify hardware upgrades.

Improving phase stability is a main issue, and digital technology offers the potential to achieve it, provided care is taken over some well-defined points: clock distribution (assuming the clock is generated from a highly stable source) and analog to digital conversion. Particular care must therefore be dedicated to these sections in order to reach the highest precision in the delay determination, at the level of a few picoseconds.

We can summarize the very general features a 2010 VLBI backend system should have:

Number of channels (IFs): between 1 and 10 on each circular polarization
Single channel bandwidth: between 1 and 4 GHz
Data representation: 1 – 4 bit for transport, 1 – 32 bit for data processing
Input: analog or digital IF

1) General Overview with block schematics

Having defined some almost fixed points, we can go ahead and outline how to implement the functionality of the single channel processor. The system is conceived as shown in fig. 2, where the analog IF enters, is converted to digital, and is then split into several channels with different spectral content. This is indicated schematically with the symbols of mixers and filters, even though it can be obtained through different methods.
The channels are output through the standard VLBI interface, VSI-E/H, from which data are ‘injected’ into the network with a dedicated protocol or, for remote stations without a high-speed connection, written to a disk recorder such as the MK5 or its possible evolutions.

Additional elements that are part of the system are:

- clock generation,
- clock distribution to the different parts,
- the total power measurement facility, both in the full bandwidth and in the narrow band sections,
- the autocorrelator for adjusting the band shape,
- CPU for configuration and communication.

2) Analog to digital conversion

The first operation to be performed on the analog IF signal coming from the receiver is to convert it to the digital domain. This operation needs some preliminary signal conditioning, because the signal processing has to operate with a multi-bit representation in order to keep harmonics and inter-modulation products at low levels. This departure from the usually adopted 1-2 bit representation makes it necessary to keep amplitude information under control in several parts of the data processing, and in particular in the input stage before the analog to digital conversion.

A view of this section can be seen in fig. 3, where the front-end of the entire system is composed of an analog anti-image filter that defines the frequency band to be converted, in accordance with the Nyquist requirements or with the sampling/conversion method adopted.

For example, with a 1 GHz sampling clock, the anti-image filter can be defined as 0-500 MHz (in practice it is not convenient to include very low frequency components, so we could set 50-500 MHz), or as 500-1000 MHz. In the former case the Nyquist condition is met directly, while in the latter the sampling acts as a down conversion, folding the band to 0-500 MHz.
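
The folding of the second Nyquist zone can be checked with a few lines of arithmetic. The sketch below is only illustrative: the helper name `alias_frequency` is hypothetical, and it assumes an ideal sampler with a real-valued input.

```python
def alias_frequency(f_mhz: float, fs_mhz: float) -> float:
    """Return the frequency (0..fs/2) at which a real tone at f_mhz
    appears after sampling at fs_mhz."""
    f = f_mhz % fs_mhz                  # sampling cannot tell f from f mod fs
    return fs_mhz - f if f > fs_mhz / 2 else f

# First Nyquist zone (0-500 MHz with a 1 GHz clock): frequencies are unchanged.
print(alias_frequency(300.0, 1000.0))   # 300.0
# Second Nyquist zone (500-1000 MHz): the band is folded onto 0-500 MHz.
print(alias_frequency(600.0, 1000.0))   # 400.0
print(alias_frequency(900.0, 1000.0))   # 100.0
```

Note that the 500-1000 MHz band appears mirrored after folding (900 MHz lands at 100 MHz), so its spectral sense is inverted with respect to the first zone.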
The system clock has to be generated with high phase stability while still keeping the cost moderate. A commercial PLO operating at 1-4 GHz could represent the best compromise, giving phase noise levels better than -90 to -100 dBc/Hz at any offset frequency greater than 100 Hz.

Important issues are related to ‘jitter’ (>10 Hz) and ‘wander’ (<10 Hz) in the sampling time. Wander in particular is strongly related to the temperature of the sampling environment, which needs to be stabilized. In a multi-channel system such as this one, the wander effect might be calibrated with the phase-cal signal, but the phase-cal signal is then required to be quite stable.

Total power measurement is required at the input side, before conversion, in order to properly set the level of the signal to be converted; similarly, the total power needs to be measured after digitization, for data processing and general use.

Data coming out of the converter have to be transported to the processing units through a fast parallel bus or a high-speed serial path. The opportunity to place this conversion section in the receiver area, where local oscillator stabilization is very often performed anyway, should be evaluated. This could lead to a much simpler phase-cal correction, because the different channels derived from the same IF, sampled in a single process, would behave in the same way. Different IFs, and therefore different analog to digital conversion units, would still need to be calibrated in phase, but only for the behaviour of the individual converters.

3) Two options for band forming, with block schematics

In the first section we discussed the need to process data in the digital domain as an ordinary calculation process. In principle, if very fast processors were available, the down conversion and band forming could be realized as a software process, in real time, near-real time, or maybe faster. Such an approach is not possible at the moment, so we indicated that a good compromise could be a hardware solution that is also, as much as possible, a software one. This perhaps strange definition finds its explanation in programmable logic and its highly flexible implementation, the FPGA. This component offers a valid way to develop a software solution within a hardware environment, without compromises in terms of clock rate and resources, and so represents a very good approach. As an illustration, fig. 4 shows an example of the possibilities offered by the different methods for obtaining similar performance (Xilinx estimation).
At present, therefore, it seems mandatory to consider an FPGA-based approach, while keeping in mind the possibility of using fully software solutions with dedicated processors in the more distant future.

Commercial FPGA technology currently offers a very high number of gates per chip, together with powerful integrated resources such as multiple processors, gigabit fiber I/O channels, phase locked oscillators, and so on. In choosing the architecture we can therefore benefit from the freedom to fix the hardware environment and work with different configuration architectures.

The down conversion to base band, as well as the band forming, can be realized using different methods. We will concentrate on the two methods proposed for VLBI, pointing out the main advantages and drawbacks of each.

A first approach could be simply to follow, with appropriate methodology, what is now performed in the analog domain (Tuccari, 2002). A schematic view can be seen in fig. 5.
The digital representation of the IF is mixed so as to translate the edge of the needed band to (almost) zero frequency; further processing then delimits the needed portion of the band using low-pass digital filtering techniques. With appropriate register dimensions, any tuning step and any bandwidth can be conceived, with the obvious limitations and constraints dictated by the maximum clock rate and by cost. Where the required clock rate cannot be reached, at high frequencies or for wide bandwidths, different methodologies have to be applied. Several developments are under study by different groups around the world, with intense research activity.
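
The mixing and low-pass steps described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the design of fig. 5: the function name `digital_down_convert` is hypothetical, the NCO is an ideal complex exponential, and a plain boxcar average stands in for a real multistage FIR low-pass filter.

```python
import cmath
import math

def digital_down_convert(samples, fs_hz, f_lo_hz, decim):
    """Mix real samples with a complex NCO, then average and decimate.

    The boxcar average is only a stand-in (assumption) for a real
    multistage FIR low-pass filter.
    """
    # NCO and mixer: translate f_lo_hz to zero frequency.
    mixed = [x * cmath.exp(-2j * math.pi * f_lo_hz * n / fs_hz)
             for n, x in enumerate(samples)]
    # Low-pass filter and decimation: one output per block of `decim` inputs.
    return [sum(mixed[i:i + decim]) / decim
            for i in range(0, len(mixed) - decim + 1, decim)]

fs = 1024e6                      # sampling clock, Hz
f_tone = 128e6                   # test tone sitting exactly at the LO frequency
x = [math.cos(2 * math.pi * f_tone * n / fs) for n in range(4096)]
baseband = digital_down_convert(x, fs, f_lo_hz=128e6, decim=64)
print(len(baseband), abs(baseband[0]))   # 64 outputs, each close to 0.5
```

A real tone of unit amplitude at the LO frequency lands at zero frequency with amplitude 0.5; the other half of its power sits at twice the LO frequency and is removed by the low-pass stage.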

This method can be applied up to 32 or 64 MHz bandwidth, with a modification of the filtering performance and a large modification of the parameters acting as gain control and requantization. For wider bandwidths, from 64 or 128 MHz upwards, it could be convenient to adopt a different method with fixed bandwidth, since only a small number of channels is needed to cover the entire band. In that case a parallel filtering approach can be applied, with or without frequency translation.

Fig. 4 indicates the flow of a complex channel represented as LSB and USB, as we are used to treating it, but a more convenient way could be to represent the data in I/Q format, as in-phase and quadrature components, for simplicity. Such a modification would of course have to be reflected in the correlator environment.

Appropriate handling of the NCO (numerically controlled oscillator) could compensate for the fringe rotation. This, together with an additional part controlling the total delay, could represent a valid method for generating a station-based observation product. Preparing the data in this way at the station would greatly simplify the operations on the correlator side, because the operation is performed once per station instead of multiple times in a baseline-based process, as is done now. Moreover, this could be a very convenient way to share the correlation process among different software correlator sections located in different places. However, this choice would reduce the field of view, which is in general a limitation for astronomical observations, so it should be adopted only as an option.

A summary of this method for producing the different band channels:
- independent channels in terms of frequency tuning and bandwidth;
- narrow and wide bands possible;
- up to 128 MHz clock rate, currently available methodology as adopted in telecommunications can be used, with a modest development time;
- for clock rates higher than 128 MHz, dedicated solutions have to be adopted, with appropriate development time (already in progress);
- station-based delay and fringe phase compensation possible;
- output as LSB&USB or I/Q samples in time;
- about 6 Mgates required for each channel at the currently maximum possible data rate, including all the auxiliary functionality (this evaluation depends on the current capability in terms of number of gates and can be greatly optimized as FPGA technology evolves).

Appendix 1 reports a development project for the EVN digital back-end that is going to adopt this architecture. It is worth noting, however, that the same hardware project is able to handle different architectures, for example the one described below.
A second method has been described (Ferris, 2002) that adopts a different approach, able to generate several channels as an ‘ensemble’. Fig. 6 shows the related architecture, and appendix 2 gives a more detailed description by H. Hinteregger.

A polyphase front-end filter acts as a decimation element and preliminary filter, while a DFT processor performs the band forming in an FFT-like process.
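
A minimal sketch of such a polyphase filter followed by a DFT is given below. It is an illustration under stated assumptions rather than the architecture of fig. 6: the function name `polyphase_channelize` is hypothetical, the prototype filter (a sinc shaped by a Hann window, 4 taps per branch) is an arbitrary choice, and the DFT is written out directly instead of using an FFT.

```python
import cmath
import math

def polyphase_channelize(x, num_chan, taps_per_branch=4):
    """Split x into num_chan frequency channels, critically sampled.

    Prototype low-pass: sinc shaped by a Hann window (an assumed choice).
    """
    m, p = num_chan, taps_per_branch
    length = m * p
    proto = []
    for n in range(length):
        t = (n - (length - 1) / 2) / m
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / length)
        proto.append(sinc * hann)
    frames = []
    for start in range(0, len(x) - length + 1, m):      # hop of m = decimation by m
        # Polyphase step: weight the block, then fold p segments of m samples.
        folded = [sum(x[start + r + m * q] * proto[r + m * q] for q in range(p))
                  for r in range(m)]
        # DFT across the m folded samples forms the channels.
        frames.append([sum(folded[r] * cmath.exp(-2j * math.pi * k * r / m)
                           for r in range(m))
                       for k in range(m)])
    return frames

M = 8
tone = [math.cos(2 * math.pi * 3 * n / M) for n in range(256)]   # centre of channel 3
frames = polyphase_channelize(tone, M)
power = [abs(v) ** 2 for v in frames[0][:M // 2 + 1]]
strongest = power.index(max(power))
print(strongest)   # index of the strongest channel
```

All channels are produced together from the same block of input samples, which is why their tuning and bandwidth cannot be set independently.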

This method presents:

- channel tuning and bandwidth that are not independent;
- simple implementation;
- narrow and wide channels possible;
- no possibility of station-based fringe phase compensation;
- output as spectral samples in time;
- number of gates variable with channel dimension, with the estimation in progress.

4) Auxiliary functionality and CPU management

Additional functions are necessary to obtain a truly useful instrument, ranging from total power measurement to data transmission. The list includes:

- total power measurement in converted channels;
- pseudo-noise and tone injection;
- autocorrelation measurement for improving band shape;
- phase-cal extraction;
- continuous amplitude calibration;
- digital to analog converter;
- digital AGC;
- gigabit data transfer.

In present VLBI terminals, total power in the analog domain is measured with square-law detectors designed to handle a wide dynamic range with very simple and efficient electronics. A similar approach to this measurement should be conceived in the digital domain as well.
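
In the digital domain the square-law detector reduces to averaging the squared samples. The following sketch assumes multi-bit integer samples; the function names and the sample values are hypothetical.

```python
import math

def total_power(samples):
    """Mean of the squared samples: the digital analogue of a square-law detector."""
    return sum(s * s for s in samples) / len(samples)

def power_change_db(before, after):
    """Ratio of two total-power readings in dB."""
    return 10.0 * math.log10(total_power(after) / total_power(before))

block = [3, -1, 2, -2, 1, -3, 0, 2]                 # hypothetical multi-bit samples
doubled = [2 * s for s in block]
print(total_power(block))                           # 4.0
print(round(power_change_db(block, doubled), 2))    # 6.02
```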

A pseudo-noise generator is useful for injecting noise-like data, as well as single tones or sweeps, into the different parts of the backend system for calibration and testing. The data are stored on disk and downloaded for real-time reproduction.

Autocorrelation of the output data channel is the way to determine the overall bandpass shape, including external analog filters, amplifiers and cables. From it, filter coefficients can be derived to correct for these effects.

Phase-cal extraction is performed by correlating the processed data with an internally generated pattern, as is commonly done.
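
As an illustration of this correlation, the sketch below takes the internally generated pattern to be a complex exponential at the phase-cal tone frequency, which is an assumption. The function name, the sample rate and the tone frequency are hypothetical, and the test signal is noise-free; in real data the tone is buried in noise and a much longer integration is needed.

```python
import cmath
import math

def extract_tone(samples, fs_hz, f_tone_hz):
    """Correlate samples with a locally generated complex tone.

    Returns (amplitude, phase in radians) of the tone at f_tone_hz.
    """
    acc = sum(x * cmath.exp(-2j * math.pi * f_tone_hz * n / fs_hz)
              for n, x in enumerate(samples))
    z = 2.0 * acc / len(samples)      # factor 2: a real tone splits into +f and -f
    return abs(z), cmath.phase(z)

fs = 32e6                              # channel sample rate (assumed)
f_cal = 1e6                            # phase-cal tone frequency in the channel (assumed)
data = [0.1 * math.cos(2 * math.pi * f_cal * n / fs + 0.7) for n in range(3200)]
amp, phase = extract_tone(data, fs, f_cal)
print(round(amp, 6), round(phase, 6))  # 0.1 0.7
```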

Continuous amplitude calibration consists of injecting a weak, fast-switched noise cal signal that is detected synchronously so as not to affect normal signal reception. It allows the amplitude calibration to be maintained with a good degree of accuracy.
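
Synchronous detection can be sketched as separating the power measured in the cal-on and cal-off states. The example assumes that each sample is tagged with the cal state and uses the usual relation Tsys = Tcal · P_off / (P_on − P_off); the function name and the numbers are hypothetical.

```python
import math

def tsys_from_switched_cal(samples, cal_on, t_cal_k):
    """Estimate Tsys (K) from samples tagged with the cal on/off state.

    Uses the usual relation Tsys = Tcal * P_off / (P_on - P_off).
    """
    on = [s * s for s, flag in zip(samples, cal_on) if flag]
    off = [s * s for s, flag in zip(samples, cal_on) if not flag]
    p_on = sum(on) / len(on)
    p_off = sum(off) / len(off)
    return t_cal_k * p_off / (p_on - p_off)

# Hypothetical data: off-power 100, on-power 102 (a 2 % cal), Tcal = 1 K.
samples = [10.0, -10.0, math.sqrt(102.0), -math.sqrt(102.0)]
cal_on = [False, False, True, True]
print(round(tsys_from_switched_cal(samples, cal_on, t_cal_k=1.0), 3))   # 50.0
```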

For monitoring purposes, a digital to analog converter operating at the output data rate with a multi-bit representation is particularly useful in combination with a spectrum analyzer, for testing the equipment or finding lines to be tuned.

Digital AGC is necessary to keep the processed data within a proper range with respect to the multiplication factors of the different filters.

When analog to digital conversion and data processing are performed in different places, data can be distributed between them through a fast serial path. For this, 10 Gigabit/s technology can be adopted, using resources available in the latest generation of FPGAs. Moreover, if an RFI mitigation process is required, the same link can also be used to insert an optional spectrum manipulation stage for an RFI environment.

We need to point out that adopting an optical high data rate serial link could greatly simplify the delay measurement that is used today mostly to account for the coaxial cable connections. Great stability is indeed assured with the digital approach and optical fiber transmission: a new back-end could do without the 'ground unit' adopted today.

One or more CPUs are necessary, for reasons we can summarize in the following list:

- handling of FPGA configuration files;
- setting of FIR filter coefficients;
- communication for standard total power measurement;
- settings and data reading for phase-cal extraction;
- continuous amplitude calibration readings;
- pseudo-noise, tone and sweep injection.

A useful approach appears to be a commercial CPU board with a fast communication bus (PCI-X or 3G) for high-level management, communicating with a processor embedded in the FPGA for the low-level operations. The commercial board should also host the interface between the backend system and the Field System PC.

5) Cost estimation

A complete cost estimate is not a simple task, because FPGA costs change continuously and new resources become available at ever lower prices. We will therefore imagine freezing the evolution at the present state of technology and prices, excluding NRE development costs, and try to obtain a figure not too far from reality by indicating the necessary commercial components. The estimate focuses on a system of independent channels, and so represents the ‘expensive’ side mentioned at the beginning of the document. Prices are for components in small quantities, without development, engineering and testing costs, whose contribution could vary considerably depending on the financing sources.

The system could be extremely flexible and scalable in terms of the number of IFs to be processed, the capability of the FPGA modules expressed as number of gates, the number of FPGA modules, and auxiliary functions such as gigabit transmission, DA converter, etc., so that very different prices could result for different needs.

Under these assumptions it is reasonable to estimate that a backend system with a functionality of 2-4 IFs would cost between 25 and 30 K$.

Please note that this estimate is based on the considerations expressed in this document and does not represent any defined cost of developments in progress. Moreover, much lower prices have been quoted for production in significant quantities, with reduced functionality.

Appendix 1: DBBC EVN Project Development

G. Tuccari

Index:

1) The need for a fully digital VLBI backend
2) Time schedule
3) Project description

1) The need for a fully digital VLBI backend

This document describes the Digital Base Band Converter project, showing the general expected performance, an overview of the architecture and a more detailed description of the different component sections. It is not intended to report operational details; its main purpose is to explain the methods and techniques adopted. For a more detailed description, an appropriate EVN document series is proposed.

It has been reported over the last few years that the VLBI terminal used in our stations is becoming more obsolete every day. This, together with several other reasons, called for a preliminary study for developing a completely digital system, and for the construction of a few prototypes to test the achievable performance. On the other hand, e-VLBI and new disk-based recorders have greatly improved reliability and performance, so that processing the received signal with an improved, more robust and predictable process appeared mandatory.

The analog terminals suffer from a lack of spare parts, and maintenance is expensive because of the particularly high cost of some components (head-stacks, capstan motors, dumper roller, etc.).

So-called ‘Digital Radio’ techniques are becoming familiar in new telecommunication developments requiring frequency conversion, so it is worth evaluating whether commercial technology is now ready for the performance needed by VLBI.

A new generation of stations without a VLBI terminal could also benefit greatly from a new development, provided the final cost is adequate for broad use.

The main goal is to replace the existing terminal with a complete, compact system to be used with any VSI-compliant recorder or data transport. As a first step, the DBBC project plans to produce four complete prototypes, to be deployed and tested at four radio telescopes. To keep the cost limited, commercially available components (no custom devices) will be adopted.

Hardware programmability is the main feature, allowing the architecture to be optimized for the needed performance; the maximum input and output data rates are the only limitation, and they are set so as to satisfy present and reasonable future needs.

The new development needs to be fully compatible with the existing terminals and correlators, while the new set of BBCs must be fully upgradable and ready, in the future, to process larger bandwidths with modified correlators. At the same time, its introduction at the stations has to be as ‘soft’ as possible.

Upgrades or improvements should be mostly software only, even if in principle they should also be possible in hardware by replacing ‘pin-to-pin’ compatible processing modules.

2) Time schedule

<table>
<thead>
<tr>
<th>TASK</th>
<th>TIME (axis: 2004.5, 2005.0, 2005.5, 2006.0, 2006.5)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Coordination and logistics</td>
<td></td>
</tr>
<tr>
<td>A/D-D/A Converter Environment</td>
<td></td>
</tr>
<tr>
<td>1024 MHz Synthesizer</td>
<td></td>
</tr>
<tr>
<td>Demultiplexer 2:8</td>
<td></td>
</tr>
<tr>
<td>Core Module Board</td>
<td></td>
</tr>
<tr>
<td>FPGA: NCO and mixer</td>
<td></td>
</tr>
<tr>
<td>FPGA: FIR autocorr.</td>
<td></td>
</tr>
<tr>
<td>FPGA: Total Power</td>
<td></td>
</tr>
<tr>
<td>PCI Interface</td>
<td></td>
</tr>
<tr>
<td>VSI Interface</td>
<td></td>
</tr>
<tr>
<td>Linux PC Board and FS</td>
<td></td>
</tr>
<tr>
<td>Packaging and mechanics</td>
<td></td>
</tr>
<tr>
<td>10Gb/s TX-RX opt. mod.</td>
<td></td>
</tr>
<tr>
<td>RFI Mitigation opt. module</td>
<td></td>
</tr>
<tr>
<td>On Field Testing</td>
<td></td>
</tr>
</tbody>
</table>

Milestones:
- First version of board available
- Final version of board available
- First development prototype observational test
- First complete prototype observational test

3) Project technical description

3.1) DBBC General Features

The main general features as defined at this stage for each DBBC unit are:

• Four IF inputs in the range 10-512 or 512-1024 MHz
• Four polarizations or bands available for a single group of output data channel selection
• 1.024 GHz fixed frequency sampling clock
• Channel bandwidth ranging from 250 kHz to 16 MHz (preliminary)
• Tuning step 50 kHz (preliminary)
• Multiple architecture using fully re-configurable FPGA Core Modules
• Modular realization for possible cascaded processing
• Field System support
• Data out as single or double VSI interface
• Total power measurement capability
• Continuous Tsys measurement capability
• Autocorrelation function for improving band shape
• Pseudo-noise and tone injection
• Digital to analog converter monitor output
• Digital AGC
• Optional gigabit data transfer

3.2) DBBC General Schematic View

The diagram shows an overview of the main components of the system. Only one IF is shown, since all four sections are identical.
The IFs coming from the receiver, after spectral limitation and amplitude conditioning, are converted to a digital representation at a fixed data rate. The simplest configuration then has a demultiplexer stage that reduces the clock frequency at the expense of a wider data representation. Further processing is performed by the so-called Core Modules, where data are processed to generate 32 or 64 channels to be transferred through one or two VSI interfaces to MK5B units or, adopting the VSI-E protocol interface (in definition), to a gigabit Ethernet connection.

A PC running Linux manages the system in its different aspects: programmable logic configuration and register setting or reading.

A more flexible implementation provides data transfer over 10G channels when the analog to digital conversion is performed at the receiver site, with the possibility of inserting additional processing units, such as RFI mitigation units.

3.3) DBBC Core Module

The Core Module units are responsible for processing the data so as to perform the different functions required. The main elements are:

• A single module able to process more channels

• More modules can be cascaded

• Three external buses: HSI, HSO, HSC

• HSI Input data bus is propagated with low skew

• HSO Output data bus is shared for multiple IF access

• HSC Control/Configuration bus

• HSX Internal data bus

• Different configuration can be supported:
  (example)
  SSB down converter
  Wide band parallel FIR
  Polyphase FIR / FFT

• A module able to handle:
  Maximum Input bandwidth 8.192 Gbit/s
  Maximum Output bandwidth 4.096 Gbit/s
  Control/Configuration bus PCI-X or 3GIO(1x) if available

• Different modules with different number of gates are supported for different functionalities and costs

• Standard Core module up to a maximum of 24 Mgates
• A module with 24 Mgates can handle up to 4 independent narrow band LSB&USB channels (*preliminary evaluation*)

• A module with 24 Mgates can handle up to 4 wide band channels (e.g. 1x512, 2x256, 4x128 MHz) (*preliminary evaluation*)

3.4) System Components

The elements composing one DBBC system are listed below, with the aim of suggesting a possible structure for sharing the work:

**• Analog to digital converter ‘environment’**

**• 1024 MHz Synthesizer**

**• Demultiplexer 2:8**

**• Core Module Board**

**• FPGAs Core Configurations**

**• PCI interface**

**• VSI-H-S-E**

**• Linux PC Board: System Management Software**

**• Field System Integration**

**• Optional Modules: 10 Gb/s serial link, RFI Mitigation processor**

1.6) Analog to digital converter ‘environment’

The front-end module is responsible for translating the analog signals, as received and down-converted in the IF range 0-512 or 512-1024 MHz, into a digital representation for further processing. The main components of this unit are:

• Conversion Clock 1024 MHz
• MAX108 analog to digital converter
• Front-end power level control
• Bandwidth 10-512 / 512-1024 MHz selection/filtering
• Pre- and post-conversion total power measurement (preliminary)
• AD temperature stabilization
• LVPECL level data bus
• PCI interface
• 1024 MHz Synthesizer

A schematic view is shown in the following diagram.

3.5) 2:8 Demultiplexer

The data coming out of the A/D unit run at a higher clock frequency than the processing units can handle, so the clock rate has to be reduced. The reduction is achieved with a parallel computing strategy, which requires a demultiplexing scheme. This function is shown in the figure.
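
The effect of the demultiplexer on the data flow can be sketched as dealing consecutive samples onto parallel buses. The example models only the net distribution of the 1024 MHz sample stream onto eight buses, ignoring that the physical stage, as the 2:8 name suggests, is fed by two input buses; the function names are hypothetical.

```python
def demultiplex(samples, lanes=8):
    """Deal consecutive samples round-robin onto `lanes` parallel buses."""
    return [samples[i::lanes] for i in range(lanes)]

def lane_clock_mhz(sample_clock_mhz, lanes=8):
    """Clock rate of each parallel bus after demultiplexing."""
    return sample_clock_mhz / lanes

stream = list(range(16))                 # stand-in for consecutive A/D samples
buses = demultiplex(stream)
print(buses[0], buses[7])                # [0, 8] [7, 15]
print(lane_clock_mhz(1024))              # 128.0
```

Each bus carries every eighth sample, so the clock drops from 1024 MHz to 128 MHz while the total data rate is unchanged.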

3.6) Core Module Board

The processing unit, called the Core Module, is a general purpose multi-FPGA board with the following main features:
• HSI Cascade-able Input Bus 8x8bit @128MHz
• HSO Shared Output bus 2x32bit @32,64MHz
• HSC Control / Configuration bus 32bit
• HSX Internal data bus 128bit @max128MHz
• Maximum 4 FPGA VirII-1152pin
• 1 FPGA VirIIPro for low level computation
• PCI interface
• ‘Sandwich’ cascade method
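
As a consistency check, the bus widths and clocks listed above can be multiplied out and compared with the maximum Core Module rates quoted in section 3.3 (8.192 Gbit/s input, 4.096 Gbit/s output). The helper below is a hypothetical illustration and assumes that every bus line carries one bit per clock cycle.

```python
def bus_rate_mbps(buses, bits_per_bus, clock_mhz):
    """Aggregate throughput of parallel buses, assuming one bit per line per clock."""
    return buses * bits_per_bus * clock_mhz

hsi = bus_rate_mbps(8, 8, 128)        # HSI input bus: 8 x 8 bit at 128 MHz
hso_64 = bus_rate_mbps(2, 32, 64)     # HSO output bus: 2 x 32 bit at 64 MHz
hso_32 = bus_rate_mbps(2, 32, 32)     # HSO output bus: 2 x 32 bit at 32 MHz
print(hsi, hso_64, hso_32)            # 8192 4096 2048
```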

A schematic view of the data bus architecture is shown in the figure.

3.7) Digital Down Converter Configuration

Main features:
- Direct conversion, typically from the high data rate sampled IF band to a lower data rate base band
- LO implemented as a Numerically Controlled Oscillator
- Complex mixer implemented as a look-up table multiplier
- Low-pass filtering with a cascade of Finite Impulse Response (FIR) filters
- Decimation, required by the high ratio between IF and output data rates, performed with multirate/multistage FIR filters
- Digital Total Power measurement at IF level
- Digital Total Power measurement at base-band level
- Gain Control
- Narrow bandwidth: 16, 8, 4, 2, 1, 0.5, 0.25 MHz
- Wide bandwidth: 512, 256, 128, 64, 32 MHz

3.8) Narrow Band SSB Down Converter Configuration Elements

- Parallel demultiplexed 8 data buses flow
- Output clock 32 MHz
- Parallel Pre-computed Oscillator (PPO)
- Multistage FIR filtering
- Hilbert Transform filter
- Gain control
- Inverse distribute FIR typology
- Digital total power meter

3.9) VSI Interface

- Two units will be used
- Data clock 32-64 MHz
- Input has to be compatible with differential positive logic
- VSI – E compatibility

3.10) Linux PC Board: System Management Software

- Standard commercial PC board including HD
- Configuration files for each FPGA stored on HD
- Software interface for FPGA configuration
- Software interface for servicing FPGAs (I/O registers access)
- Software interface for A/D level control
- Software interface for VSI interface (DOT clock and mode selection)
1.0 Introduction

Digital signal-processing technology has now progressed to the point where a fully “digital back end” (DBE) has many advantages, including both price and performance. In this brief note, we examine one possible concept for a flexible DBE and examine its probable performance and cost.

2.0 Goals of the DBE Design

The goals of the DBE design presented in this document can be summarized as follows:

- **Performance**
  - Improved RFI tolerance, detection and avoidance compared to analog systems
  - Reduced spurious harmonic and inter-modulation response compared to analog systems
  - Absolute stability and repeatability
  - Increased channel bandwidth and, potentially, ability to handle increased IF bandwidth

- **Implementation**
  - Use of FPGA-based design for ease of fixes, upgrades and flexibility
  - Based on industry-standard COTS components

- **Cost**
  - Reduce per unit reproduction cost to ~$10K/system (compared to >$100K for current analog backend systems)

3.0 Basic DBE Concept

Figure 1 illustrates the basic DBE system which we propose to use as a building block for a VLBI DBE. We will follow the signal through the chain, using specific illustrative parameters that are appropriate for VLBI:

1. A standard analog bandpass filter, also known in this application as a ‘single-Nyquist-zone’ filter, selects the portion of the total IF band to be processed. In this example, the 512-1024 MHz portion of the input IF signal is selected.

2. An analog-to-digital converter operating at the Nyquist rate of 1024 MSamples/sec digitizes the signal with 8 bits/sample. Sampling at 8 bits/sample gives the output of the A/D converter sufficient dynamic range to deal with RFI as much as ~36 dB above the total receiver noise. The total data rate at the output of the A/D is 8192 Mbps.

3. A poly-phase FIR (finite-impulse response) filter followed by an FFT of $2^k$ points separates the signal into $2^k$ frequency channels. In this illustrative example, we have chosen $k=12$, so that the 512 MHz total signal bandwidth is separated into 4096 frequency channels of 125 kHz, each represented by 250 kSamples/sec at 8 bits/sample. The total data rate from the FFT module is still 8192 Mbps.

4. Any 125 kHz channels that have objectionable RFI may be excised (i.e. set to 0); the signal is then re-synthesized to a new set of channel bandwidths. In this illustrative example, we show re-synthesis to 32 channels of 16 MHz bandwidth each; all 32 channels are simultaneously available and any or all may be chosen as part of the VLBI observation. Normally, the samples will be re-quantized to 2 bits/sample before recording, bringing the total available data rate for this example to 2048 Mbps if all 32 channels are selected.
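
The data rates quoted in steps 2 to 4 can be checked with a few lines of arithmetic. The sketch below is only bookkeeping, not part of the proposed design; the function and constant names are hypothetical, and the numbers are the illustrative parameters used above.

```python
# Bookkeeping for the illustrative DBE parameters (names are hypothetical).
SAMPLE_RATE_MSPS = 1024                    # A/D sample rate, MSamples/s
ADC_BITS = 8                               # bits per A/D sample
K = 12                                     # FFT size exponent (2**K points)
IF_BANDWIDTH_MHZ = SAMPLE_RATE_MSPS / 2    # one Nyquist zone: 512 MHz


def adc_rate_mbps():
    """Data rate at the A/D output."""
    return SAMPLE_RATE_MSPS * ADC_BITS


def fine_channels():
    """Return (number of channels, channel bandwidth in kHz, samples/s in kS/s)."""
    n = 2 ** K
    bw_khz = IF_BANDWIDTH_MHZ * 1000 / n
    rate_ksps = 2 * bw_khz                 # Nyquist-sampled channel
    return n, bw_khz, rate_ksps


def fft_output_rate_mbps():
    """Aggregate rate of all fine channels at 8 bits/sample."""
    n, _, rate_ksps = fine_channels()
    return n * rate_ksps * ADC_BITS / 1000


def resynth_rate_mbps(n_channels=32, bw_mhz=16, bits=2):
    """Rate after re-synthesis and re-quantization to `bits` bits/sample."""
    return n_channels * (2 * bw_mhz) * bits


print(adc_rate_mbps(), fine_channels(), fft_output_rate_mbps(), resynth_rate_mbps())
```

With the parameters above this reproduces the 8192 Mbps A/D and FFT output rates, the 4096 channels of 125 kHz, and the 2048 Mbps rate after re-quantization to 2 bits/sample.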

Comments:
Because of the nature of the digital processing in this concept, no ‘offset LO’ is available; such an offset is often used to place phase-calibration tones at frequencies that are not integer multiples of 1 MHz. Any such offset must already be present in the applied IF signal, which implies that the first LO system must be capable of providing the necessary offset, typically 10 or 20 kHz. Such an offset first LO system has been proposed by Alan Rogers but has not been constructed. On the other hand, the improved linearity of the DBE may render an LO offset unnecessary.

The number of points in the FFT modules must be a power of two. If a modern FPGA is used in the implementation, the power of two selected for each FFT module may be configured into the FPGA at the time of the experiment. This lets the investigator choose the channel resolution for each experiment.

The FFTs may be configured to impose a frequency shift of all output channels by ½ of the channel bandwidth. This may be useful in some circumstances to window spectral lines or, along with another DBE system, to provide a redundant frequency-shifted set of channels to eliminate seams in a reconstructed broadband spectrum.
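
The note does not say how the half-channel shift would be implemented. One common approach, assumed here purely for illustration, is to pre-rotate the input samples by $e^{-j\pi m/N}$ before the transform, so that bin $k$ is centred on frequency $(k + \tfrac{1}{2})/N$ instead of $k/N$. The sketch uses a direct DFT on 8 points for clarity; `shifted_dft` is a hypothetical name.

```python
import cmath


def shifted_dft(x):
    """DFT evaluated at bin centres (k + 1/2) instead of k.

    Assumption: the shift is applied by pre-rotating the input, which is one
    of several equivalent ways to obtain a half-bin offset.
    """
    n = len(x)
    # Multiplying sample m by exp(-j*pi*m/n) moves every bin up by half a bin.
    rotated = [x[m] * cmath.exp(-1j * cmath.pi * m / n) for m in range(n)]
    return [
        sum(rotated[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


N = 8
K0 = 2
# A complex tone lying exactly halfway between ordinary bins K0 and K0 + 1.
tone = [cmath.exp(2j * cmath.pi * (K0 + 0.5) * m / N) for m in range(N)]
magnitudes = [abs(v) for v in shifted_dft(tone)]
print([round(v, 6) for v in magnitudes])
```

In an unshifted transform this tone would leak into every bin; with the half-bin shift all of its power lands in bin `K0`.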

4.0 Suggested DBE System for VLBI

Figure 2 shows a suggested block diagram for a VLBI Baseband Output Module (BBOM) with four IF inputs, which is typical of modern VLBI systems with two dual-polarization frequency bands. As shown, the BBOM consists of four of the basic DBE systems illustrated in Figure 1; the BBOM is provided with a common Reference Frequency (from which the sample clock is generated) and a common 1PPS timing signal. A ‘Select’ module at the end of the signal chain allows any of the output channels from the four DBEs to go to standard VSI-H outputs or to be transmitted on multi-Gbps serial outputs. If the output is re-quantized to 2 bits/sample, as would normally be the case, a total of 8192 Mbps is available at the output of the BBOM.
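
As a rough illustration of what the ‘Select’ stage implies for the output rate, the sketch below models a selection as a list of (IF input, channel) pairs. It assumes the Section 3 example of 32 re-synthesized channels of 16 MHz per IF at 2 bits/sample; the data model and names are hypothetical and are not a description of the actual module.

```python
# Hypothetical model of the 'Select' stage; parameters follow the Section 3 example.
N_IF = 4                 # IF inputs per BBOM
CHANNELS_PER_IF = 32     # re-synthesized channels per IF
CHANNEL_BW_MHZ = 16
BITS_PER_SAMPLE = 2


def channel_rate_mbps():
    """Rate of one Nyquist-sampled channel after re-quantization."""
    return 2 * CHANNEL_BW_MHZ * BITS_PER_SAMPLE


def selected_rate_mbps(selection):
    """Total output rate for a set of (if_input, channel) pairs."""
    for if_input, channel in selection:
        if not (0 <= if_input < N_IF and 0 <= channel < CHANNELS_PER_IF):
            raise ValueError(f"no such channel: IF {if_input}, channel {channel}")
    return len(set(selection)) * channel_rate_mbps()


everything = [(i, c) for i in range(N_IF) for c in range(CHANNELS_PER_IF)]
first_eight_of_each = [(i, c) for i in range(N_IF) for c in range(8)]
print(selected_rate_mbps(everything), selected_rate_mbps(first_eight_of_each))
```

Selecting every channel gives the full 8192 Mbps; any subset scales the rate in steps of 64 Mbps per channel.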

A significant advantage of this conceptual approach is that the entire bandwidth of the four 512 MHz IF channels is simultaneously available, from which the user may choose any subset.

5.0 Implementation & Cost
The DBE block diagram in Figure 2 suggests dividing the DBE into two types of sub-assemblies: an IF Input Module (IFIM) and a Baseband Output Module (BBOM).

5.1 IF Input Module (IFIM)
The IFIM digitizes the incoming IF signal and creates a high-speed serial output data stream to send to the BBOM. The IFIM consists of three sub-modules:

1. **Bandpass Filter:** The IF input must be band-pass filtered in order to select a Nyquist-zone and adequately reject signals outside it. Nyquist zone edges are spaced at ½ sample-rate intervals in frequency. Only standard VLBI sampling-rates ($2^N$ Msamples/sec) are being considered for DBE implementation, perhaps up to 4096 MS/s in the 2006 to 2010 time frame. Thus, specification of the fixed analog IF-filter is tied to the choice of one of these sampling-rates and, in practice, choice of the 1st or 2nd Nyquist zone (the latter rejects harmonics but makes sampling-clock and A/D aperture jitter more critical). Expected filter cost: < $50 in small quantities.

2. **A/D Converter**: A high-speed A/D with nominally 6 or 8 bits will be selected to sample the properly Nyquist-zone-filtered analog IF input signal directly. A good candidate, just introduced, is the National ADC081000, an 8-bit 1.9 V CMOS converter rated for up to 1.6 GS/s and 1.6 GHz input frequency. It consumes 1.4 W and costs $100, ¼ the power and 1/6 the cost of the Maxim MAX108, until now the only COTS device with comparable performance. Faster and dual versions should be available in 2005. Thus, a 2006 implementation can safely target 1024 MS/s operation, and 2048 MS/s may well be practical by then. Low-cost devices for 4096 MS/s service, perhaps with integrated multi-Gbps serial outputs, are likely to be available by 2010 for 2nd generation IFIM designs.

3. **A/D Interface**: The A/D outputs need to be interfaced to the BBOM in a standard manner, to be defined, which includes electrical, mechanical, and protocol specification. We suggest conversion from today’s typical 8-bit/sample, 1:2-sample de-multiplexed, parallel A/D outputs (16 LVDS lines at 512 Mbaud) to the XFI-standard 10 Gbps serial format, which is compatible with standard XFP plug-in transceivers. A ‘platform’ FPGA (Xilinx XC2VPX20) can readily provide both 1) the parallel A/D interface, and 2) up to 8 ‘lanes’ of high-speed 2.5-10 Gbps serial interface (to the BBOM). Cost of this ‘future-proof’ chip is expected to be < $250 in small quantity in 2006.
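
The Nyquist-zone rule in item 1 and the interface rate in item 3 can be illustrated with a short sketch. The helper names are hypothetical; the only inputs are the figures quoted above (zone edges at multiples of half the sample rate, and 16 LVDS lines at 512 Mbaud feeding a nominal 10 Gbps serial link).

```python
def nyquist_zone(f_mhz, fs_msps):
    """1-based Nyquist zone containing f (zone edges at multiples of fs/2)."""
    return int(f_mhz // (fs_msps / 2)) + 1


def zone_edges(zone, fs_msps):
    """Lower and upper edge of a Nyquist zone, in MHz."""
    half = fs_msps / 2
    return ((zone - 1) * half, zone * half)


def aliased_frequency(f_mhz, fs_msps):
    """Frequency at which f appears in the sampled spectrum (0 to fs/2)."""
    half = fs_msps / 2
    zone = nyquist_zone(f_mhz, fs_msps)
    offset = f_mhz - (zone - 1) * half
    # Even-numbered zones appear spectrally inverted after sampling.
    return half - offset if zone % 2 == 0 else offset


def parallel_rate_mbps(bits=8, demux=2, mbaud=512):
    """Aggregate rate of the de-multiplexed parallel A/D outputs."""
    return bits * demux * mbaud


print(zone_edges(2, 1024), aliased_frequency(700, 1024), parallel_rate_mbps())
```

For a 1024 MS/s clock the 2nd zone spans 512 to 1024 MHz, matching the example in Section 3, and the 8192 Mbps parallel rate fits within the nominal 10 Gbps of a single serial link.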

5.2 Baseband Output Module (BBOM)

The BBOM accepts the four digitized IF streams and, for each, performs the user channelization through a Filter Bank, as indicated in Figure 2. The ‘Select’ module then selects which of the digitized baseband channels are to be provided to the user; output to the user is available either in standard VSI-H format or as a simple serial version of VSI-H (perhaps using XFP plug-in transceivers). Additional functions of the BBOM include:

1. Critical timing: (a) generation of Sampling-Clock (SC) from a Reference Frequency input and distribution of SC to IFIMs, and (b) synchronization to 1PPS of IFIM, Filter Bank, and BBOM operations
2. FPGA configuration
3. Control
4. Monitor (sample and spectrum statistics, total power, etc.)
5. Communication and management (possibly from a PC via a USB port)
6. Possible additional functions: NCO-driven (multiplexed) complex mixers for (a) baseband phase-cal tone extraction, (b) frequency offset, and/or (c) Doppler compensation
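
To make the idea of NCO-driven phase-cal tone extraction concrete, the following is a minimal sketch under stated assumptions: a real baseband channel sampled at 32 MS/s (a 16 MHz channel) containing a 1 MHz tone, mixed against a complex NCO at the tone frequency and averaged over a whole number of tone cycles. The function name and all parameter values are illustrative and are not taken from the design.

```python
import cmath
import math


def extract_tone(samples, tone_mhz, fs_msps):
    """Estimate (amplitude, phase in radians) of a tone in a real sample stream.

    Each sample is multiplied by a complex NCO at -tone_mhz and the products
    are averaged; the factor 2 recovers the amplitude of a real cosine.
    """
    acc = 0j
    for m, x in enumerate(samples):
        nco = cmath.exp(-2j * cmath.pi * tone_mhz * m / fs_msps)
        acc += x * nco
    phasor = 2 * acc / len(samples)
    return abs(phasor), cmath.phase(phasor)


# Illustrative values only (assumptions, not design parameters).
FS_MSPS = 32.0
TONE_MHZ = 1.0
TRUE_AMP = 0.01
TRUE_PHASE = 0.7
N_SAMPLES = 3200      # exactly 100 tone cycles, so the image term averages out

samples = [
    TRUE_AMP * math.cos(2 * math.pi * TONE_MHZ * m / FS_MSPS + TRUE_PHASE)
    for m in range(N_SAMPLES)
]
amp, phase = extract_tone(samples, TONE_MHZ, FS_MSPS)
print(amp, phase)
```

The same mixer structure, driven at a different NCO frequency and without the averaging step, would serve for a frequency offset or Doppler compensation.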

The Filter Banks are most efficiently and flexibly implemented in FPGAs using poly-phase FIR/FFT algorithms. Fully characterized FFT IP cores are available from Xilinx. Several other groups, notably Berkeley in support of ATA and CSIRO in support of ATCA, continue to develop poly-phase (PP) FIR cores that may be adapted for DBE Filter Bank design. A high-speed serialized IF sample stream (1024 MS/s = 8192 Mb/s) received by a Filter Bank FPGA must first be internally de-multiplexed so that, for example, 8 samples are processed in parallel with a 128 MHz Filter Bank clock (in this case 4 copies of a complex FFT core are needed). An XC2VP40 or two VP20s should suffice for a relatively simple 64- to 1024-point Filter Bank; a VP70 may be needed for a more complex version that adds higher resolution, RFI excision, and wideband re-synthesis (inverse transform). In 2006, the cost of a Filter Bank FPGA capable of processing one IF data stream is expected to range from $200 to $800, depending largely on the complexity of the design and on whether a 1024 or 2048 MS/s sample rate is to be supported. Because of the variety of parameters that may be chosen for a Filter Bank, consideration might be given to designing it as a pluggable sub-module.
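
The de-multiplexing step described above can be pictured as a round-robin split of the serial sample stream into parallel lanes, each running at a proportionally lower clock. The sketch below is a software analogy only, with hypothetical names; it is not FPGA code.

```python
def demultiplex(samples, lanes=8):
    """Split a serial sample stream into `lanes` parallel streams (round-robin)."""
    if len(samples) % lanes:
        raise ValueError("stream length must be a multiple of the lane count")
    return [samples[i::lanes] for i in range(lanes)]


def parallel_clock_mhz(sample_rate_msps=1024, lanes=8):
    """Clock rate needed when `lanes` samples are processed per clock cycle."""
    return sample_rate_msps / lanes


stream = list(range(16))          # stand-in for consecutive A/D samples
lanes = demultiplex(stream)
print(lanes[0], lanes[7], parallel_clock_mhz())
```

Lane 0 carries samples 0, 8, 16, ... and lane 7 carries samples 7, 15, 23, ..., so eight lanes bring the 1024 MS/s stream down to the 128 MHz clock quoted above.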

In addition to the four Filter Banks, the BBOM is estimated to need at most one XC2VPX70 for its remaining duties; the cost of this FPGA is expected to be < $600 in 2006.

5.3 Baseband Output Module (BBOM) Replication Cost Estimate in 2006:

- IFIM: < $500 each
- Filter Bank: < $500 each at 1024 MS/s
- XFP optical transceiver: < $500 each for 10 km range
- BBOM: < $1000 w/o IFIM modules, Filter Bank modules or XFP transceivers
- Chassis w/ power supply: < $1000

2006 DBE Total: < $8000, for a ‘full’ one-BBOM system (including 4 IFIMs, 4 Filter Banks, and 4 XFP standard plug-in optical transceivers).
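
The total follows directly from the per-item upper bounds listed above. The short check below uses those bounds and the quantities of a ‘full’ system; the dictionary layout is just an illustrative way to hold them.

```python
# Upper-bound unit costs (USD, 2006 estimates from the list above) and the
# quantity of each item in a 'full' one-BBOM system.
ITEMS = {
    "IFIM": (500, 4),
    "Filter Bank (1024 MS/s)": (500, 4),
    "XFP optical transceiver": (500, 4),
    "BBOM (without plug-in modules)": (1000, 1),
    "Chassis with power supply": (1000, 1),
}

total_upper_bound = sum(cost * qty for cost, qty in ITEMS.values())
print(total_upper_bound)
```

Because every entry is an upper bound, the sum of $8000 is itself an upper bound on the replication cost.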

Figure 1: Basic DBE Concept. Blocks in signal order: analog BP filter (‘single Nyquist-zone’ filter; example: 2nd zone), A/D converter (1024 MS/s, 8 b/sample), poly-phase FIR, FFT ($2^k$ pts), RFI excision & re-synthesis. Data rate labelled in the figure: 8192 Mbps.

Figure 2: Suggested block diagram for Baseband Output Module (BBOM)