Science Data Processor

Architecture

Along with other SKA consortia, SDP is working towards implementing Software Engineering Institute (SEI) approaches to its software architecture documentation. This style is useful for promoting architecture decision processes based on concrete “quality attribute” scenarios. These methods will help us address the many architectural trade-offs that SDP faces, balancing scientific accuracy with long-term modifiability and performance.

In addition, we will be looking into structuring documentation around the SEI “Views and Beyond” approach, which includes methods for targeting specific audiences with relevant documentation. This should make the SDP architecture easier to navigate for readers from outside the SDP, which will become more important as we move towards construction.

At this point we have had two joint workshops to increase knowledge of the SEI processes and terminology, one in January at SKA headquarters in Jodrell Bank and another in Cambridge in March specifically addressing SDP considerations.

Performance Model

A performance prototype platform (P-cubed) is being built at the West Cambridge Data Centre to support SDP prototyping activities. This is funded by a grant from the STFC in the UK. Service support is provided by University Information Services at Cambridge, with whom the SDP project works quite closely.

The platform will comprise a number of compute and controller nodes, high performance networking as well as a variety of storage sub-systems. Together these will support horizontal and vertical prototyping activities exploring various interfaces of the SDP Compute Island as well as the ability of SDP to adopt a standard Big Data execution framework.

The P-cubed middleware will comprise an upstream OpenStack software-defined environment, providing a flexible platform for prototyping.

The hardware was selected in November and is in the process of being commissioned. The system will be available by the end of this month.

One of the P3 equipment racks hosted in Cambridge

Parametric Model

The SDP Consortium has continued to engage with the SKAO to develop system sizing and costing methodologies. The SDP hardware system sizing, as in the Preliminary Design Review approach, is based on parametric models of the computational needs of the different High Priority Science Objective (HPSO) experiments, which are combined to estimate the average system size required (in terms of delivered computational rate). The processing (covering all the experiments in appropriate proportions) over a few days is then modelled to see how much unprocessed data would sit in the SDP’s buffers.

The model outputs then allow us to estimate the buffer size required to give suitable flexibility in scheduling, whilst balancing that flexibility against the cost of the additional buffer.
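The buffer estimate described above can be illustrated with a toy simulation. This is only a minimal sketch, not the consortium's parametric model: the function name `simulate_buffer`, the first-in-first-out processing order, and all of the numbers are assumptions made for illustration. Each scheduled observation adds a block of data that needs a given amount of processing work, the system completes a fixed amount of work per time step, and the buffer holds every block that is not yet fully processed.

```python
from collections import deque


def simulate_buffer(schedule, compute_rate):
    """Return the buffer occupancy after each time step.

    schedule     -- list of (data_volume, work_needed) pairs, one per step
                    (hypothetical units; not taken from the SDP model)
    compute_rate -- processing work the system can complete per step
    """
    queue = deque()  # blocks awaiting processing: [data_volume, remaining_work]
    occupancy = []
    for data_volume, work_needed in schedule:
        queue.append([data_volume, work_needed])
        budget = compute_rate
        # Spend this step's compute budget on the oldest blocks first.
        while queue and budget > 0:
            block = queue[0]
            spent = min(budget, block[1])
            block[1] -= spent
            budget -= spent
            if block[1] == 0:
                queue.popleft()  # fully processed, so it leaves the buffer
        # Data stays in the buffer until its processing has finished.
        occupancy.append(sum(block[0] for block in queue))
    return occupancy


# Illustrative schedule: one demanding experiment among lighter ones.
schedule = [(10, 4), (10, 12), (10, 4), (10, 4)]
occupancy = simulate_buffer(schedule, compute_rate=6)
peak_buffer = max(occupancy)  # the buffer size this schedule would need
```

Running the same schedule with a range of `compute_rate` values gives a rough feel for the trade-off between delivered compute and buffer size, which is the kind of question the real model addresses in far more detail.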

A series of staged SDP deployment scenarios have been analysed in which a fractional SDP system (e.g. representing 20%, 40%, 60%, 80% of a full system) is procured at the time of the final array release, with this smaller system being torn down and replaced with a full size system after 1, 2, 3, 4 or 5 years.

Our work on the parametric model for compute load will continue in the coming months, during which we will also work with the Telescope Manager consortium and SKAO to refine the astronomy scheduling model. The buffer size work must be tested against a new scheduling model (covering a longer period) and must incorporate the size of the SDP staging area, where the products of SDP pipelines will sit before transfer out to the SKA Regional Centres.

Integration Prototype

The SDP Integration Prototype (SIP) is designed to allow the SDP to prototype its external and internal interfaces (i.e. where different parts of the SKA and SDP join together). Its aim is to provide a proof of concept that the SDP can work: that it is possible to collect data from the CSP, process and archive it, and deliver it to astronomers. It is not concentrating on SDP performance (there are other prototyping activities, including activities with industry, which examine this aspect), but the SIP team will be one of the major users of the SDP Performance Prototype Platform. This is because many of the internal interfaces SIP wants to test can be implemented using OpenStack, the Infrastructure-as-a-Service system used on the P^3 prototype (for example, OpenStack provides ways of managing messaging, workflows, and file and object stores). The SDP needs all of these capabilities, so SIP will be evaluating whether the OpenStack services meet the SDP’s requirements.

The interface between SDP and TM is very important: it’s how the SDP gets its information about what is being observed by the telescope. The SIP now has a working SDP controller, using TANGO (the control system chosen by the SKA). Below is a diagram showing the code classes used by the SDP Master Controller. The Master Controller can create and monitor ‘slave controllers’, which is where the main work of the SDP is done. These slave controllers report on the status of the tasks that they’re running. These tasks can be quite big: all of the real-time processing for one observation scheduling block would count as a single ‘task’ to be managed by a slave controller. The Master Controller can get status updates from the slaves and report them back to the SKA Telescope Manager when needed.

Master Controller class diagram
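The relationship between the Master Controller and its slave controllers can be sketched in a few lines of plain Python. This is an illustrative sketch only: it does not use the TANGO API, and the class names, method names, state strings and task names are all assumptions rather than the SIP code shown in the class diagram.

```python
from dataclasses import dataclass, field


@dataclass
class SlaveController:
    """Manages one (possibly very large) task and reports on its state."""
    task_name: str
    state: str = "RUNNING"  # hypothetical state label

    def status(self):
        return {"task": self.task_name, "state": self.state}


@dataclass
class MasterController:
    """Creates slave controllers and aggregates their status reports."""
    slaves: dict = field(default_factory=dict)

    def create_slave(self, task_name):
        slave = SlaveController(task_name)
        self.slaves[task_name] = slave
        return slave

    def report(self):
        # The summary that would be passed on to the Telescope Manager.
        return [slave.status() for slave in self.slaves.values()]


master = MasterController()
realtime = master.create_slave("scheduling-block-1-realtime")
master.create_slave("scheduling-block-2-realtime")
realtime.state = "FINISHED"  # the slave updates its own state as work proceeds
summary = master.report()
```

The point of the sketch is the division of labour: the master only creates slaves and collects their reports, while each slave owns the state of its task.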

This system will allow the SIP team to test out different Execution Frameworks (the part of the SDP that does the main work of image processing), see how they interface to the rest of the SDP system, and how these frameworks might be managed by the SDP Master Controller. This will be a light-touch, loosely coupled management: it’s the equivalent of a human manager checking in with a worker and seeing that they’re OK and have everything they need, rather than standing there every second and micro-managing. This means that both the Master Controller and the Execution Framework are more resilient – if they lose contact for a couple of seconds (or perhaps longer), a loosely-coupled system means that this doesn’t stop the other component from working. The SIP team will be testing out this interface between the Master Controller and the Execution Framework to ensure that it has these properties.
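One way to picture this kind of loose coupling is a heartbeat with a tolerance window. The sketch below is a hypothetical illustration, not the SIP interface: the class names `LooseMonitor` and `Worker`, the `tolerance_s` parameter and the status strings are invented for the example. The worker carries on with its work whether or not its heartbeat gets through, and the monitor only changes its view of the worker once the silence has lasted longer than the tolerance.

```python
class LooseMonitor:
    """Manager side: records heartbeats and judges how fresh they are."""

    def __init__(self, tolerance_s):
        self.tolerance_s = tolerance_s
        self.last_seen = {}

    def record_heartbeat(self, name, now):
        self.last_seen[name] = now

    def view(self, name, now):
        if name not in self.last_seen:
            return "UNKNOWN"
        age = now - self.last_seen[name]
        # A short silence is tolerated; only a long one changes the view.
        return "OK" if age <= self.tolerance_s else "STALE"


class Worker:
    """Worker side: does its work and sends heartbeats on a best-effort basis."""

    def __init__(self, name, send):
        self.name = name
        self.send = send  # callable that delivers a heartbeat to the manager
        self.steps_done = 0

    def step(self, now):
        self.steps_done += 1  # the work itself never waits on the manager
        try:
            self.send(self.name, now)
        except ConnectionError:
            pass  # a lost heartbeat is tolerated; try again on the next step


monitor = LooseMonitor(tolerance_s=5)
link = {"up": True}


def send(name, now):
    if not link["up"]:
        raise ConnectionError("link down")
    monitor.record_heartbeat(name, now)


worker = Worker("execution-framework", send)
worker.step(now=0)       # heartbeat delivered
link["up"] = False       # contact is lost...
for t in range(1, 9):
    worker.step(now=t)   # ...but the worker keeps going
```

Timestamps are passed in explicitly so that the example runs instantly; a real system would read a clock instead.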

As part of the image processing, SDP has to circulate information about the calibration of the telescope. This information is calculated during image processing on different parts of the data, and then has to be shared so that the whole of the image processing pipeline can use it. SIP are going to test out the Redis distributed database to see whether it can meet the SDP’s requirements.

Naturally, this can’t be completed overnight; we’re planning a series of releases of the SIP code over the next six months to support this work. And of course the SIP team aren’t just writing code to do this, they’re also creating tests (so we can check that the code is working correctly, and that it’s not been broken by any changes) and documentation (so the rest of the SDP can understand the work done by the SIP team).

Report provided by the SDP consortium