The “Central Signal Processor” (CSP) Consortium comprises 13 signatories from 8 countries, with more than 10 additional participating organisations. The Consortium includes a rich mixture of engineers, scientists and managers from academic institutions, industry and government labs spread over 5 continents (see https://www.skatelescope.org/csp/ for more details). As might be expected, it has been a challenge to proceed efficiently with such a diverse and distributed team.
The lead organisation of the Consortium is the National Research Council of Canada (NRC). NRC has contracted MDA Systems Ltd. (MDA) to assist in leading the Consortium.
What are we designing?
The CSP Element includes design of the hardware and associated firmware/software necessary for the generation of visibilities, pulsar survey candidates and pulsar timing data from the telescope arrays. More background on the CSP can be found in the previous eNews submissions: http://newsletter.skatelescope.org/category/pdf-version-of-enews/
Current Status of Design Activities
Since the last eNews submission in August the CSP team has completed another round of costing and participated in the Engineering Meeting in Stellenbosch, which included a CSP Consortium day of meetings (Figure 1). The team has been updating the ICDs and progressing the requirements to support the System PDR and to pave the way to CDR.
The sub-element design teams have continued to progress with detailed design and prototyping, focussing on the most promising architectures and technologies. There has also been much activity on the system engineering side (requirements, ICDs, modelling, processes, standards, ILS/RAMs) with contributions from New Zealand Alliance, STFC, CSIRO, ASTRON, Oxford, AUT, University of Manchester, NRC, Swinburne, and MDA. There are still challenges in finalising the Level 1, 2, and 3 requirements needed for efficient progression to CDR.
Key Sub-element Design Development
Local Monitoring and Control (LMC)
The CSP Local Monitoring and Control (LMC) Sub-element is responsible for coordinating all the CSP processing functions according to commands from the Telescope Manager (TM), returning status rolled up from the various processing sub-elements, and configuring and sequencing the sub-elements. This sub-element is being led by NRC with assistance from NCRA and INAF. The CSP LMC team is actively supporting SKAO-led initiatives to define SKA standards and guidelines for implementation of the monitor and control system and the SKA software engineering process. The CSP LMC team is leading the effort on the definition of states, modes, commands and configuration and contributing to the definition of the design patterns for generation and handling of logs and alarms. Significant progress has been made on the definition of interfaces. The INAF team is developing a prototype based on the current version of the SKA Control Systems Guidelines.
LOW Correlator and Beamformer (Low.CBF)
In early September a majority of the team travelled to Hong Kong for the System Refresh Meeting, which aimed to refresh the team on the latest thinking across all areas of the project (technical, management and system engineering). The meeting was well attended, with 19 collaborators in total from CSIRO, ASTRON, AUT and the SKAO. The team celebrated its first year together, and the progress made to date, with a cake (Figure 2). There was more to the meeting than cake: there was also a shirt (Figure 3)! The team endured almost 40 hours of presentations during the week on all aspects of the system, which generated much discussion and progress. A second revision of the Gemini hardware was also presented, which moves the FPGA boards from a 1U chassis to a 4U card/subrack system. This improves many operational aspects of the system.
Five Low.CBF team members also attended the SKA Engineering Meeting in South Africa, which allowed much progress to be made on the interfaces between elements (Infrastructure, Pulsar Search/Timing, SDP and LMC). The idea of building a common hardware platform for the CSP Low and Mid correlator beamformers was also discussed, and was explored in depth for a month after the meeting. Unfortunately the two instruments have many different optimisation goals, and the common platform resulted in a less optimal solution (risk, power and cost) for Low.CBF. Much was learnt from this exercise, and it made a clearer path for the sub-elements to follow. One common outcome was the adoption of High Bandwidth Memory (HBM) as a replacement for the Hybrid Memory Cube (HMC) memory used on the first Gemini hardware. This change in memory technology enables a lower power solution and reduces the effort to implement memory interfaces.
During the past four months the Low.CBF sub-element has made many advances in software, hardware, system engineering and planning. Even with this progress, it has been a difficult few months, as there have been many requirement updates and major design changes. We feel that the design is getting better with each iteration and is converging on a solution. All of this now needs to be captured in our documentation. A pre-CDR meeting has been organised for 7-14 December 2016 in Canada to prepare the team for the pre-CDR submission in March 2017. The location is near the Dominion Radio Astrophysical Observatory (DRAO), which will enable several important face-to-face discussions (LMC, hardware, system engineering) with our Canadian colleagues working on Mid.CBF. The next four months will be very exciting and the team is looking forward to putting together a solid pre-CDR submission.
MID Correlator and Beamformer (Mid.CBF)
The Mid.CBF Sub-element is led by NRC and is based on a Stratix 10 FPGA solution. This is a joint effort with MDA, NZ Alliance, UPM Spain, and Selex ES (now Leonardo). Six members of the Mid.CBF team travelled to the 2016 SKA Engineering Meeting in Stellenbosch. Progress was made on finalising ICDs with other elements and many ideas were discussed within the larger CSP team. The Mid.CBF team left with new ideas that have been incorporated in the system design and with a clear path towards sub-element and element CDRs.
The Mid.CBF team has gone through a design optimisation process to decrease complexity, risk, power, and cost. As noted above, this involved working closely with the Low.CBF team for a period of time. The result is a design with fewer custom hardware modules and Line Replaceable Units (LRUs), with the previous motherboard/mezzanine/IO boards replaced with a single board design (Figure 4). The Mid.CBF team held a workshop in November at DRAO to review the latest design and plan the way forward.
The Mid.CBF system will be based on two processing LRUs: the TALON-MX and the TALON-SX. The TALON-MX is based on a Stratix 10 FPGA containing 16 GB of High Bandwidth Memory (HBM), and the TALON-SX is based on a Stratix 10 FPGA without HBM but with more signal processing resources. The two LRUs will be identical except for the FPGA. Hardware design is underway for the TALON-SX LRU, and design of the TALON-MX LRU will begin in early 2017. The TALON-MX/SX LRUs are blades that will fit vertically in a 4U sub-rack, with each sub-rack containing 12 LRUs. A backplane in the sub-rack will provide power, liquid cooling (Figure 5) and optical connections.
The Mid.CBF team has also progressed the software architecture for monitor and control. Mid.CBF will make use of the quad-core ARM A53 processor systems included on each Stratix 10 FPGA. This approach will allow TANGO and Linux to run directly on each FPGA and will greatly simplify the monitor and control software. The Mid.CBF team has purchased two Arria 10 SoC development kits in order to develop and prototype the use of the embedded ARM processors. These activities will allow the Mid.CBF team to test the monitor and control infrastructure in parallel with Stratix 10 based hardware development.
Other activities underway include:
- Studies on the use of non-water liquid cooling approaches within the screened KAPB.
- Mid.CBF FPGA design, sizing and prototyping of the highest-risk IP blocks.
- Development of models to confirm the algorithms and predicted performance.
The next year will be exciting and challenging as the team prepares a detailed and comprehensive CDR submission.
Pulsar Search Engine (PSS)
The Pulsar Search Engine is a large sub-element of the CSP that searches for pulsars and fast transients, and it will have almost identical instances for both SKA-mid and SKA-low. The design team is led by the University of Manchester, University of Oxford and the Max Planck Institute for Radio Astronomy, supported by input from INAF Italy, NZ Alliance, ATC Edinburgh, and ASTRON.
As part of our ongoing prototyping effort, the PSS team conducted tests to study the energy consumption pattern of the first protoNIP server located in Cape Town. A team from SKA South Africa (Gumede et al.) supported this activity by providing and configuring the energy meter, which was set up to capture voltage and current readings from the dual power supplies of the server. Fine time resolution recordings were made with the energy meter while the GPU and FPGA accelerators and the network-stack-cpu in the server were activated with representative programs. During the study we also gathered energy consumption patterns corresponding to the power-on/off cycles of the server. The measurements obtained provide useful feedback for cross-verifying theoretical estimates.
To verify the proposed software architecture for the PSS of SKA1, a complete version of the time domain acceleration search (TDAS) software module was implemented in the Cheetah/Panda framework. Here, we have used the Thrust parallel algorithms library to quickly prototype and deploy a GPU-based processing pipeline, capable of searching timeseries for accelerated pulsar signals. In brief, such a search consists of resampling a timeseries to a range of accelerations each of which is subsequently fast Fourier transformed. The resultant fluctuation spectra are then harmonically summed (a process in which we try to recover power that is distributed throughout a given signal’s harmonics) and statistically significant signals are recorded for further investigation. This prototyping exercise has allowed us to test the Cheetah/Panda framework in real-world conditions on real-world data, an invaluable step towards verifying the PSS design.
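The three steps of such a search (resample, Fourier transform, harmonic sum) can be illustrated with a small, self-contained sketch. This is not the Cheetah/Panda or Thrust implementation: it is a pure-Python toy that assumes nearest-neighbour resampling, a first-order quadratic drift model, and a noise-free synthetic pulse train. All function names, and the dimensionless drift parameter alpha (standing in for a*dt/(2c), where a is the acceleration, dt the sampling interval and c the speed of light), are assumptions made for illustration.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def resample(series, alpha):
    """Undo a quadratic time drift: output sample i is read from input
    sample i + alpha*i**2 (nearest neighbour, zero outside the input)."""
    n = len(series)
    out = []
    for i in range(n):
        j = round(i + alpha * i * i)
        out.append(series[j] if 0 <= j < n else 0.0)
    return out

def power_spectrum(series):
    """Power in the positive-frequency bins (the fluctuation spectrum)."""
    spec = fft(series)
    return [abs(z) ** 2 for z in spec[: len(series) // 2]]

def harmonic_sum(power, n_harm):
    """For each bin k, add the power found at bins k, 2k, ..., n_harm*k."""
    summed = [0.0] * len(power)
    for k in range(1, len(power)):
        summed[k] = sum(power[h * k] for h in range(1, n_harm + 1)
                        if h * k < len(power))
    return summed

def acceleration_search(series, trial_alphas, n_harm=4):
    """Return (alpha, bin, summed_power) of the strongest candidate."""
    best = (None, 0, 0.0)
    for alpha in trial_alphas:
        summed = harmonic_sum(power_spectrum(resample(series, alpha)), n_harm)
        k = max(range(1, len(summed)), key=summed.__getitem__)
        if summed[k] > best[2]:
            best = (alpha, k, summed[k])
    return best

# Synthetic, noise-free pulse train with an injected drift (toy values only).
N = 1024
F0 = 1.0 / 16.0           # pulse frequency in cycles per sample
ALPHA0 = 1.0 / (32 * N)   # injected drift parameter
signal = [1.0 if (F0 * (j - ALPHA0 * j * j)) % 1.0 < 0.2 else 0.0
          for j in range(N)]
trials = [m * ALPHA0 for m in (-2, -1, 0, 1, 2)]
best_alpha, best_bin, best_power = acceleration_search(signal, trials)
```

In this toy case the trial that matches the injected drift concentrates the pulse power back into single spectral bins, so its harmonic sum exceeds that of the mismatched trials. A production search would add noise statistics, significance thresholds and a much finer grid of trial accelerations.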
As part of this process a version of the SIFT module has been completed and implemented in the Cheetah processing pipeline. The SIFT module groups harmonically related signals from pulsar candidates detected during pulsar search operations. By grouping related signals, only unique detections are kept, which greatly reduces the total number of candidates that will be sent on for further processing and science analysis. Unit and functional tests of the Cheetah module have been run and passed successfully. The inclusion of SIFT in Cheetah is one of the final steps in the implementation of a prototype end-to-end Cheetah pulsar search pipeline.
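The idea of grouping harmonically related candidates can be sketched as follows. This is an illustrative toy, not the algorithm used by the Cheetah SIFT module: the Candidate fields, the small-integer ratio test, and the tolerance and harmonic limits are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    freq_hz: float  # detected spin frequency
    snr: float      # detection significance

def is_harmonic(freq, ref_freq, max_harm=8, tol=1e-3):
    """True if freq is close to (m/n) * ref_freq for small integers m, n."""
    ratio = freq / ref_freq
    for m in range(1, max_harm + 1):
        for n in range(1, max_harm + 1):
            target = m / n
            if abs(ratio - target) <= tol * target:
                return True
    return False

def sift(candidates):
    """Group candidates around the strongest member of each harmonic family.

    Candidates are visited in order of decreasing S/N, so the first entry
    of every group is the detection that would be kept."""
    groups = []
    for cand in sorted(candidates, key=lambda c: c.snr, reverse=True):
        for group in groups:
            if is_harmonic(cand.freq_hz, group[0].freq_hz):
                group.append(cand)
                break
        else:
            groups.append([cand])
    return groups

# Hypothetical candidate list: one pulsar seen at several harmonics,
# plus one unrelated signal.
candidates = [
    Candidate(10.0, 20.0),
    Candidate(20.001, 12.0),
    Candidate(5.0, 8.0),
    Candidate(30.0, 6.0),
    Candidate(7.3, 9.0),
]
groups = sift(candidates)
unique = [group[0] for group in groups]
```

Here the five hypothetical candidates reduce to two unique detections, which is the kind of reduction the text describes before candidates are sent on for further processing.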
Pulsar Timing Engine (PST)
The Pulsar Timing Sub-element will perform high-fidelity, high-precision timing observations of known pulsars for both SKA-low and SKA-mid. The primary computational task performed by this instrument is phase-coherent dispersion removal, which requires performing many large Fast Fourier Transform (FFT) operations in real time. The PST Sub-element design is based on COTS hardware with GPU accelerators, and an early version of this solution is currently being commissioned at the MeerKAT telescope. Since our last update in August, we have been working closely with the CBF design teams at CSIRO and NRC to complete the Interface Control Documents (ICDs) that define the interaction between the Low and Mid beam formers and the PST. In collaboration with NVIDIA, we started benchmarking our prototype software on an IBM platform that hosts 4 Pascal P100 GPUs. Preliminary results indicate that our design will meet all functional requirements. We have also continued our detailed analysis of the proposed inversion of polyphase filterbank output using high-resolution FFTs; this technique will reduce cost and enable the PST to recover time resolution following frequency-domain beamforming. Our research effort on this front will be reported and submitted to a refereed journal in 2017.
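To give a feel for why phase-coherent dispersion removal needs large FFTs, the sketch below estimates the dispersive smearing across a single channel and a corresponding transform length. It uses the standard cold-plasma dispersion delay; the dispersion measure, channel centre frequency and bandwidth are hypothetical values rather than SKA parameters, and the rule that the FFT should span at least twice the smearing time is an assumption made here for illustration.

```python
import math

# Dispersion constant in s MHz^2 pc^-1 cm^3.
K_DM = 4.148808e3

def dispersion_delay_s(dm, f_lo_mhz, f_hi_mhz):
    """Arrival-time delay of f_lo relative to f_hi for dispersion measure dm."""
    return K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)

def min_fft_length(dm, centre_mhz, bandwidth_mhz):
    """Smallest power-of-two FFT length spanning twice the smearing time,
    assuming complex sampling at a rate equal to the channel bandwidth."""
    f_lo = centre_mhz - bandwidth_mhz / 2.0
    f_hi = centre_mhz + bandwidth_mhz / 2.0
    smear_s = dispersion_delay_s(dm, f_lo, f_hi)
    smear_samples = math.ceil(smear_s * bandwidth_mhz * 1e6)
    n = 1
    while n < 2 * smear_samples:
        n *= 2
    return n

# Hypothetical example: DM = 100 pc cm^-3, 1 MHz channel centred on 350 MHz.
smear = dispersion_delay_s(100.0, 349.5, 350.5)
fft_len = min_fft_length(100.0, 350.0, 1.0)
```

Because the smearing grows roughly as the inverse cube of the observing frequency, the required transform length rises steeply at the low end of the band, which is one reason GPU accelerators suit this workload.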
Path to CDR
Overall, the CSP Consortium has made good progress since August. The focus is to “freeze” the requirements and ICDs to support efficient progression to CDR. There is a lot of work to do to get to CDR, but the team is up to the challenge.