UK air traffic woes caused by 'invalid flight plan data'
Former BA boss slams resilience, says explanation 'doesn't stand up from what I know of the system'
Mystery still surrounds Monday's technical issue at the UK's National Air Traffic Services (NATS), which is being blamed on the receipt of incorrect flight plan data that caused the system to revert to manual processing, leading to flight delays and cancellations.
You might be thinking the system and its operators would reject malformed or incorrect input and continue as normal without drama. The Register understands that invalid flight plan messages submitted to the Flight Plan Processing System are automatically detected and collected in an invalid queue. At this point, we understand, air traffic control staff are responsible for processing the invalid entries by manually correcting them on a first come, first served basis, which suggests that such issues are not uncommon.
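To make that description concrete, here is a minimal Python sketch of the general pattern: validate each incoming message, accept the good ones, and park the bad ones in a first come, first served queue for a human to correct. It is purely illustrative and is not based on any knowledge of NATS's actual software. The FlightPlan fields, the validation rules (a non-empty callsign and four-letter airport codes), and the FlightPlanProcessor class are all assumptions made up for this example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FlightPlan:
    # Hypothetical, heavily simplified message; real flight plans carry far more.
    callsign: str
    departure: str
    destination: str


def _is_airport_code(code: str) -> bool:
    # Assumed rule for this sketch: four uppercase ASCII letters.
    return len(code) == 4 and code.isascii() and code.isalpha() and code.isupper()


def is_valid(plan: FlightPlan) -> bool:
    # Assumed validation rules, chosen only to keep the example small.
    return (
        bool(plan.callsign)
        and _is_airport_code(plan.departure)
        and _is_airport_code(plan.destination)
    )


class FlightPlanProcessor:
    """Accepts valid plans and queues invalid ones for manual correction."""

    def __init__(self) -> None:
        self.accepted: list[FlightPlan] = []
        self.invalid_queue: deque[FlightPlan] = deque()

    def submit(self, plan: FlightPlan) -> bool:
        # A bad message is set aside; processing of later messages continues.
        if is_valid(plan):
            self.accepted.append(plan)
            return True
        self.invalid_queue.append(plan)
        return False

    def correct_next(
        self, fix: Callable[[FlightPlan], FlightPlan]
    ) -> Optional[FlightPlan]:
        # Take the oldest invalid plan (first come, first served), apply the
        # operator's correction, and resubmit it through the same validation.
        if not self.invalid_queue:
            return None
        corrected = fix(self.invalid_queue.popleft())
        self.submit(corrected)
        return corrected
```

In this sketch a still-invalid correction simply rejoins the back of the queue, and one bad message never stops the processor from handling the next. Whether the real system behaves that way is exactly the open question.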
It is conceivable that the problems seen on Monday may stem from human error when attempting to correct an invalid flight plan message. We asked NATS if this might be the case, but we did not receive an answer.
What NATS would say is that the ensuing IT meltdown on Monday caused thousands of flight cancellations and delays across the UK, with NATS CEO Martin Rolfe telling the BBC on Wednesday that the situation was unusual and "should not happen again."
But the precise nature of that problem has yet to be fully explained, and may not come to light before investigators deliver a preliminary report to the Secretary of State for Transport next Monday.
In a statement posted on Tuesday, NATS disclosed that early investigations pointed to flight plan data, received from an airline, that could not be processed for some reason, causing all automatic processing to cease. There are no indications that this was caused by a cyber-attack, the organization said.
Some of the UK's more excitable tabloid media outlets have already reported that a French airline's flight plan submission may be to blame.
"Our systems, both primary and the backups, responded by suspending automatic processing to ensure that no incorrect safety-related information could be presented to an air traffic controller or impact the rest of the air traffic system," Rolfe said in the NATS statement.
Suspending automatic processing meant that flight plans had to be input manually by air traffic control staff, who were unable to handle the volume usually processed by the automatic systems, leading to traffic flow restrictions being imposed.
"Very occasionally technical issues occur that are complex and take longer to resolve. In the event of such an issue our systems are designed to isolate the problem and prioritise continued safe air traffic control," Rolfe added. "At no point was UK airspace closed but the number of flights was significantly reduced."
Tim Jeans, formerly managing director of Monarch Airlines, told the BBC's Today program this morning: "There are four hours of flight data stored during the course of any working day, so if there is a fault, basically NATS has four hours to fix it. Now clearly they weren't able to fix it in four hours, the airlines will be rightly asking 'Why isn't it four hours or eight hours or 24 hours?' That's the service they're paying for."
About 1,100 flights were canceled with hundreds of thousands of travelers affected.
Rolfe responded that the four-hour time frame was down to the airlines, which only load their flight plans four hours in advance.
Travel chaos continues
NATS said that the glitch was dealt with and all of its systems have been running normally since Monday afternoon in order to support airline and airport operations as they recover from the incident.
There was still disruption on Wednesday, however, with many UK travelers abroad still reportedly not able to get alternative flights following Monday's cancellations.
Rolfe told the BBC that measures were now in place to protect against the "incredibly rare" system failure and that, should it happen again, it could be resolved "very, very quickly."
Resiliency – we've heard of it
A spokesperson for the UK Strategic Aviation Special Interest Group (SASIG) told us it was concerned that the supporting computer systems at NATS are so stretched that if they should fail, even for a relatively short period, the disruption is enormous.
"This suggests a lack of resiliency provision and raises the question whether NATS regard the costs of such as being too great and that because they do not have to bear the costs of disruption it's a risk they are willing to take."
Willie Walsh, Director General of the International Air Transport Association and former boss of British Airways, said in a statement that NATS had crucial questions to answer about their responsibility for what he called a "fiasco."
"The failure of this essential service is unacceptable and brings into question the oversight of the CAA (Civil Aviation Authority) who are required to review the NATS resilience plan under the terms of its licence," he stated.
Walsh told the Today program that the airline industry was facing substantial costs as a result of Monday's incident, possibly as high as £100 million ($126 million) in total. He added: "I find it staggering, I really do – you know, the system should be designed to reject data that's incorrect, not to collapse the system.
"And if that is true, it demonstrates a considerable weakness that must have been there for some time.
"And I'd be amazed if that is the cause of this.
"Clearly we'll wait for the full evaluation of the problem but that explanation doesn't stand up from what I know of the system." ®