Simulator Reliability, Failure, and Repair Times
Written by Antonio R. G. Pineda   
Saturday, 20 January 2007

Introduction and Definitions

We all hear people, from different ends of our operations, giving their opinions about how good or bad the simulator is, all based on their own subjective assessments.

  • This simulator isn't very reliable
  • The simulation does not seem to be very reliable
  • This simulator fails too often
  • You have to be able to demonstrate a high reliability operation
  • We need to know your MTBF and MTTR in order to evaluate the efficiency of your operation

We can be left feeling that no matter how much we do, it never seems to be enough to satisfy anyone - from pilots to authorities - not even our own technical manager. However, their allegations may have some substance, because we don't have a number that we can give them to prove where we are. We do not have an objective way to measure the device, and at the end of the day we really do need one to keep all the interested parties satisfied.

The objective of this article is to provide a general idea of how we can handle these demands; to clarify the terms required to do so; and most importantly of all, to come up with a mechanism that will allow us to compare our devices to others - since the measures are only any good if they can be compared throughout the industry.

Simulators and Other Machines

Simulators are machines. These machines do have a unique application - to provide training to pilots and cabin crews - but we should treat them like any other machine: a washing machine, a car, a microwave oven, a printing press, a packaging machine, etc. None of these is different: an objective study of their working patterns and maintenance requirements should provide solid evidence with which to measure their overall performance.

It seems common sense to make this comparison, and so it should be. Once we recognise and take into account the peculiarities of a simulator, we will be able to treat it in the same way as a car, or any other machine.

You use your car every day. You start the engine every day. You drive it every day. If you manage to do that for a whole year and are then asked "How is your car?", you will have no hesitation in saying "My car is pretty reliable; it has never let me down", meaning you are almost certain that every time you want to use it, it will be there for you, and will get you where you are going without delays and without failing to do anything you expect of it.

The same idea applies to a simulator. If every time a crew wants to use it they can do so, the simulator is considered to be a reliable machine.

Of course, we all know that every 5000km we need to change the oil of our car, and every week we should check the hydraulic oil level of the simulator. We need time to do this, and during that time we cannot use the car or simulator for its primary purpose - to carry us around, or to train pilots, respectively.

These tasks are called preventive maintenance, and the time taken to carry them out is not generally counted as time in which the car or simulator is in use.

The problem comes when you want to drive your wife to the supermarket and can't, because the car won't start: the battery is flat and takes two hours to recharge sufficiently to start the engine. The same applies to a simulator when the motion won't engage and the crew is delayed by one hour. This is known as an interrupt, and the actions needed to resolve the problem are known as corrective maintenance.

Measuring Reliability

Let's start by defining the different terms used in an industrial environment, and see how we can apply them to our area of interest.

Most people have a vague concept of reliability, and they associate reliability with quality, but these two terms are not the same. We will try to shed some light on the following terms: Reliability, MTBF, and MTTR. We will not cover the meaning of the term "Quality" here, as it's a complex subject and will need a completely different article to do it justice.

If you search dictionaries or technical manuals for definitions of reliability, you will find many different opinions on what constitutes a reliable machine. Here, we will go for a simple approach, hoping that it satisfies most of our audience of highly experienced engineers, catches the interest of managers, and starts a debate on what reliability means for simulators, and how it should be measured and reported.

One dictionary definition of "reliable" is:

Collins Electronic Dictionary:
adj: able to be trusted; predictable or dependable.

A measure of reliability, however, is a way to show, as a percentage, how often the operation of a device was successful, or fulfilled its intended purpose.

That sounds simple enough, but how do you go about getting the figure you're looking for? Let's build up a solution using our car as an example:

You drive your car every morning. Last month, which had 30 days, you drove it on every one of those days without it ever letting you down. This means that your car was 100% reliable last month.

The following month you were only able to drive your car on 25 days out of the 30-day month. During the remaining five days the car was in a garage due to an oil leak on the power steering. This time it wasn't 100% reliable, but how reliable was it?

We can obtain a figure as follows:

  • R = Reliability
  • Tp = Number of times usage was planned
  • D = Number of successful attempted usages

Result:

  • R = ( D / Tp ) * 100
  • R = ( 25 / 30 ) * 100
  • R = 83% Reliability Index or Percentage

Is this a good figure? What is the car next door doing?

Before we try to compare our car to the car next door, we need to know that the one next door is used in a comparable way. If we try to compare our car, which is used only once per day, with a delivery van used 10 or more times a day - quite apart from the fact that it is a van - the comparison won't be valid.

What's needed is a measure of usage. Usage is the number of days on which we actually try to use the car, compared with the number of days in the month on which it is available. I have the car available for 30 days, but I only use it on 20 of those days: I do not use it at weekends!

  • U = Usage
  • Ta = Number of times the device is available for use
  • Tu = Number of times an attempt to use the device is made

Result:

  • U = ( Tu / Ta ) * 100
  • U = ( 20 / 30 ) * 100
  • U = 67% Usage Index or workload of the device

Once we know the workload (usage) we can compare the reliability and know that the comparison is based on equivalent demands on the compared devices. Common sense dictates that reliability figures will only be valid if they compare like-with-like.
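As a rough illustration of the two indices above, the following Python sketch computes reliability and usage for the car and for a delivery van, and only treats the reliability figures as comparable when the usage figures are close. The van's numbers and the 10-point tolerance are made-up assumptions for this example; they do not come from any standard.

```python
def percentage(part, whole):
    """Return part / whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def comparable(usage_a, usage_b, tolerance=10.0):
    """True when two usage indices differ by at most `tolerance` points.

    The tolerance is a hypothetical threshold chosen for this sketch.
    """
    return abs(usage_a - usage_b) <= tolerance


# Car figures from the text: 25 successful days out of 30 planned,
# and 20 attempted days out of 30 available.
car_reliability = percentage(25, 30)
car_usage = percentage(20, 30)

# Van figures are invented for illustration only.
van_reliability = percentage(290, 300)
van_usage = percentage(300, 300)

print(f"Car: R = {car_reliability:.1f}%, U = {car_usage:.1f}%")
print(f"Van: R = {van_reliability:.1f}%, U = {van_usage:.1f}%")
if comparable(car_usage, van_usage):
    print("Usage is similar, so the reliability figures can be compared.")
else:
    print("Usage differs too much for a like-with-like comparison.")
```

With these numbers the car and the van have very different workloads, so the sketch declines to compare their reliability figures.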

It should now be relatively easy to apply this to simulators. We only need the following numbers to calculate our usage and reliability.

  • Havailable = Total number of hours per year for which the device can be booked, excluding preventive maintenance time and configuration change times.
  • Hbooked = Total number of hours actually booked and used per year.
  • Hdown = Total number of hours of downtime per year caused by unexpected failures.

Usage per year will be:

  • Usage = ( Hbooked / Havailable ) * 100 %

The higher the figure the better - meaning more money made.

Reliability per year will then be:

  • Reliability = ( ( Hbooked - Hdown ) / Hbooked ) * 100 %

Again, the higher the figure obtained the better - meaning more sessions completed and hopefully more customers satisfied. However, something's still missing: What about the quality of training obtained? This will be left for later.

So far so good, we have covered the reliability and usage details, and it's pretty straightforward: No complex computer program needed, only good record keeping.
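To show how little is needed, here is a minimal Python sketch of the two simulator formulas above. The function names and the yearly totals are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def usage_percent(h_booked, h_available):
    """Usage = ( Hbooked / Havailable ) * 100."""
    if h_available <= 0:
        raise ValueError("h_available must be positive")
    return 100.0 * h_booked / h_available


def reliability_percent(h_booked, h_down):
    """Reliability = ( ( Hbooked - Hdown ) / Hbooked ) * 100."""
    if h_booked <= 0:
        raise ValueError("h_booked must be positive")
    return 100.0 * (h_booked - h_down) / h_booked


# Hypothetical yearly totals for one device, in hours.
h_available = 6000.0  # bookable hours, preventive maintenance excluded
h_booked = 4500.0     # hours actually booked and used
h_down = 90.0         # downtime caused by unexpected failures

usage = usage_percent(h_booked, h_available)
reliability = reliability_percent(h_booked, h_down)

print(f"Usage:       {usage:.1f}%")
print(f"Reliability: {reliability:.1f}%")
```

For these assumed totals the device was booked for 75% of its available hours and delivered 98% of the booked hours.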

Failure Rates

We now move on to the other measurements, which aren't so obvious: MTBF and MTTR.

MTBF - Mean Time Between Failures

Put simply: The average number of hours the simulator can be expected to work without a failure occurring.

The calculation would seem simple, but we need to collect new data to be able to calculate it: The number of interrupts or failures causing a loss of planned usage time (or an increase in downtime).

  • Nint = Number of Interrupts
  • Hbooked = Total number of hours actually booked and used per year

With these new parameters we can calculate an MTBF:

  • MTBF (in hours) = Hbooked / Nint

Yet again, the higher the figure the better - meaning many hours between interrupts, or a low number of interrupts for the number of hours used.
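The MTBF formula above can be sketched in Python as follows. The totals are hypothetical, and returning infinity for a year with no interrupts is a choice made for this sketch rather than anything the formula prescribes.

```python
import math


def mtbf_hours(h_booked, n_interrupts):
    """MTBF (in hours) = Hbooked / Nint.

    With no interrupts the ratio is undefined; this sketch returns
    infinity in that case, which is an assumption, not a standard.
    """
    if n_interrupts < 0:
        raise ValueError("n_interrupts cannot be negative")
    if n_interrupts == 0:
        return math.inf
    return h_booked / n_interrupts


# Hypothetical yearly totals for one device.
h_booked = 4500.0
n_interrupts = 150

mtbf = mtbf_hours(h_booked, n_interrupts)
print(f"MTBF: {mtbf:.1f} hours")
```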

MTTR - Mean Time to Repair

It can be quite useful to compare the average time taken to effect the repairs that follow interrupts. MTTR is the average time required to repair any fault that has caused an interrupt.

To calculate this we need the number of interrupts that have occurred, and the time taken for each to be repaired. These two figures can then be used to obtain an average time to repair.

If the actual repair times aren't available, an alternative but less useful figure can be obtained by simply dividing the total downtime by the number of interrupts. This, however, doesn't tell the full story, because a simulator may be out of service while waiting for a critical spare part (logistics time), and that waiting time is then counted as if it were repair time.

When comparing MTTR figures it is important to be clear about which method has been used, to ensure that like is compared with like; otherwise the results will not be valid!
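The difference between the two methods is easy to see in a small Python sketch. The repair times and the downtime total below are invented for illustration; the downtime is assumed to include time spent waiting for a spare part.

```python
def mttr_from_repair_times(repair_hours):
    """Average of the individual repair times, in hours."""
    if not repair_hours:
        raise ValueError("at least one repair time is needed")
    return sum(repair_hours) / len(repair_hours)


def mttr_from_downtime(h_down, n_interrupts):
    """Fallback figure: total downtime divided by number of interrupts."""
    if n_interrupts <= 0:
        raise ValueError("n_interrupts must be positive")
    return h_down / n_interrupts


# Hypothetical data: four interrupts with their hands-on repair times.
repair_hours = [0.5, 1.0, 0.5, 2.0]

# Hypothetical total downtime for the same four interrupts,
# including a long wait for a spare part.
h_down = 28.0

mttr_repairs = mttr_from_repair_times(repair_hours)
mttr_downtime = mttr_from_downtime(h_down, len(repair_hours))

print(f"MTTR from repair times: {mttr_repairs:.1f} hours")
print(f"MTTR from downtime:     {mttr_downtime:.1f} hours")
```

The same four interrupts give an MTTR of 1 hour by the first method and 7 hours by the second, which is why the method used should always be stated alongside the figure.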

Simulator Reliability, Failure, and Repair Times - Summary

To recap: to present Reliability, MTBF, and MTTR figures that give management and third parties repeatable measures, we need:

  • Havailable: Total number of hours per year that the device can be booked for training.
    Excludes preventive maintenance time and configuration change times.
  • Hbooked: Total number of hours actually booked and used per year
  • Hdown: Total number of hours of downtime per year caused by unexpected failures.
  • Nint: Number of Interrupts per year

With the above figures we can provide statistics for: Usage; Reliability; MTBF; and MTTR per device, per year.
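As one possible way of organising the record keeping, the sketch below starts from a log of individual interrupts and derives all four statistics for a device-year. The class names, field names, and sample figures are hypothetical; MTTR here uses the recorded repair times rather than the downtime fallback.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interrupt:
    downtime_hours: float  # total time the device was unavailable
    repair_hours: float    # hands-on repair time only


@dataclass
class DeviceYear:
    h_available: float           # bookable hours in the year
    h_booked: float              # hours actually booked and used
    interrupts: List[Interrupt]  # one record per interrupt

    def report(self):
        """Return Usage, Reliability, MTBF and MTTR for the year."""
        n_int = len(self.interrupts)
        h_down = sum(i.downtime_hours for i in self.interrupts)
        repair_total = sum(i.repair_hours for i in self.interrupts)
        return {
            "usage_pct": 100.0 * self.h_booked / self.h_available,
            "reliability_pct": 100.0 * (self.h_booked - h_down) / self.h_booked,
            # None marks figures that are undefined with no interrupts.
            "mtbf_hours": self.h_booked / n_int if n_int else None,
            "mttr_hours": repair_total / n_int if n_int else None,
        }


# Hypothetical device-year with three logged interrupts.
year = DeviceYear(
    h_available=6000.0,
    h_booked=4500.0,
    interrupts=[
        Interrupt(downtime_hours=1.0, repair_hours=0.5),
        Interrupt(downtime_hours=26.0, repair_hours=2.0),
        Interrupt(downtime_hours=0.5, repair_hours=0.5),
    ],
)

report = year.report()
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```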

Is it really so simple? Yes and no. It is from the industrial point of view, but we mustn't lose sight of the fact that simulators are complex devices, and a lot is expected of them in terms not only of reliability, but of quality of training as well. Here we've covered the reliability side of the user's expectations. The quality of training obtained will, as said earlier, need to be covered elsewhere.

Interest in the Simulation Community

Having got this far, we should think about who is interested in all of this.

One of the most important groups that want to know these figures is the civil aviation authorities, so that they can get a feel for how well maintained the simulator is, and whether it's fulfilling its primary purpose of providing valuable and uninterrupted training to the crews that use it. Industry bodies are also very interested in this data and how to calculate it: the FSEMC and ARINC produced a set of standards a long time ago that define simulator metrics. ARINC Report 433, Standard Measurements For Flight Simulator Quality, goes into great detail and tries to take into account all possible scenarios. But how many operators are actually following this standard as presented in the report?

Quality managers, very much in control these days, are looking forward to seeing these figures, not only for their own devices, but for those of other operators as well.

Typical FSEMC Reliability Discussions

The following is taken from the minutes of the FSEMC meeting that took place in Thailand in September 2006, reproduced courtesy of ARINC. The full report can be obtained from the ARINC Web site.

Open Q&A Session Discussion Covering the ARINC 433 Report

WEISS/BOEING - I wanted to mention is back to Item 13 where it referenced ARINC 433. I wanted to let the conference [know] that ARINC 433 is up for an update. ARINC has canvassed the industry to see if there is interest in people working on a task group to look at ARINC 433 again and provide updates if necessary. There is interest there. If anybody does not or has not heard about that, I wanted to let the conference know. If they do have an interest in supporting a task group to look at and update ARINC 433, please contact Sam Buckwalter at ARINC.
JACKSON/FEDEX - One of the themes that has gone back and forth in this conference and always does is collection of data with regard to simulator failures in reliability. At FedEx we have our own system for collecting data on simulator faults, and I'm sure other people have their own systems, as CAE and TT&S both have their built in system. If in ARINC 433 there was a standard for the exchange of this data, it would be possible for somebody to collect the data and create a data base of real data as to what simulators’ real MTBF is across the industry and what are common failures and things like that.
CRONAN/ROCKWELL COLLINS - Along the same lines as FedEx’s comments, my question is specifically to Todd from Airbus. He mentioned MTBF any where between 10 and 30 hours. I was wondering if that was actual hardware failures or are you also counting interruptions for that. And I'd like to know if what Todd is experiencing is typical of most of the users in here.
METTS/AIRBUS - I would say that our lost time is really any call, whether it is the crew’s fault, or whether it is hardware, software, anything for the most part. That does vary by month to month. So it is not a really good base line at times, but it does help if you have the same type of sims in the same place, then there is baseline to say whether this one performs well or not. Besides that, if you have only one type of a sim it’s harder to tell, unless you look at the past. It would be neat to see and look at others. That might work against us too. My boss might say we'll look there.

Interface Discussion

FedEx is interested in the new SIM XXI linkage technology and is requesting customer feedback regarding reliability, stability, and supportability. What are the major problems encountered by end-users with SIM XXI devices? Are these end-user identified problems being resolved in a timely manner? Other users and vendor comments, please. Value of Resolution: Reduce costs (acquisition or operation)

Motion and Control Loading Discussion

Questions raised by Cathay Pacific Airways (CPA).
  • Are there any simulator operators who have electrical motion in use?
  • What is the feedback on reliability, maintainability, and fidelity?

SimUser Web Site Discussions

The SimUser Web site has also seen fairly active discussion on reliability topics. This started in November 2002, and continues, on-and-off, to this day. The discussion topic has had, at the time of writing, 68 replies and has been viewed a staggering 3626 times! It seems that simulation engineers have a lot of interest in this subject. However, after more than a year of debate there was no consensus on how or what to do. This shows how difficult it is to get a common approach to this issue. The discussion can be viewed and joined in the SimUser Web site forums.

What is the Way Forward?

Everybody seems to be interested in seeing figures for reliability, and especially to compare their own figures with those from other simulators or operators.

Other than the basic, overall figures, some would like to break these down even further into systems, sub-systems, manufacturers, or even to component level!

One obstacle to getting an industry-wide appreciation of simulator reliability is nervousness about letting other operators see figures that you may be embarrassed about. Whether you are in fact doing quite well against the industry average remains a complete unknown.

Simulator Reliability Program (SRP)

SimUser is proposing that, based on the calculation methods above, as many operators as feel able to should submit their data per device. This data would be handled in strict confidence and not kept in its raw form on the Web site. SimUser will collate the data made available and post generalised statistics that will be freely available.

The data that would be needed to make this possible is as follows:

  • The Usage percentage for the past year
  • The Reliability percentage for the past year
  • The MTBF for the past year
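How the submitted figures might be collated into generalised statistics is not specified here; the Python sketch below shows one simple possibility using the standard `statistics` module. The three submissions are made-up values, and the choice of minimum, median, mean, and maximum is an assumption for illustration.

```python
from statistics import mean, median

# Made-up submissions, one dictionary per device. No device codes
# or operator names are carried into the published summary.
submissions = [
    {"usage_pct": 70.0, "reliability_pct": 97.0, "mtbf_hours": 20.0},
    {"usage_pct": 80.0, "reliability_pct": 98.0, "mtbf_hours": 30.0},
    {"usage_pct": 90.0, "reliability_pct": 99.0, "mtbf_hours": 40.0},
]


def summarise(records, key):
    """Return anonymous summary statistics for one submitted figure."""
    values = [record[key] for record in records]
    if not values:
        raise ValueError("no submissions to summarise")
    return {
        "devices": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }


summary = {key: summarise(submissions, key)
           for key in ("usage_pct", "reliability_pct", "mtbf_hours")}

for key, stats in summary.items():
    print(key, stats)
```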

For each device for which data is supplied, SimUser will provide the operator with a unique reference code so that the operator can identify its own devices. The codes issued will not be made public.

The suggested codes will be in the following format: TTTNNN, where TTT will be a code for the Training Centre, and NNN will be a unique code ID for each simulator at that centre. If the manufacturer, technology used, and simulator type (aircraft simulated, etc) are also supplied this would make the data far more useful.
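The TTTNNN format could be built and checked along the following lines. The article does not say which characters are allowed, so this Python sketch assumes three uppercase letters for the training centre and three digits for the simulator; both the assumption and the helper names are hypothetical.

```python
import re

# Assumed layout: three letters (TTT) followed by three digits (NNN).
CODE_PATTERN = re.compile(r"[A-Z]{3}[0-9]{3}")


def make_code(centre, sim_number):
    """Build a TTTNNN code from a centre code and a simulator number."""
    centre = centre.upper()
    if not re.fullmatch(r"[A-Z]{3}", centre):
        raise ValueError("centre code must be three letters")
    if not 0 <= sim_number <= 999:
        raise ValueError("simulator number must be between 0 and 999")
    return f"{centre}{sim_number:03d}"


def split_code(code):
    """Split a TTTNNN code into its centre and simulator parts."""
    if not CODE_PATTERN.fullmatch(code):
        raise ValueError(f"not a valid TTTNNN code: {code!r}")
    return code[:3], code[3:]


code = make_code("abc", 7)
print(code)
print(split_code(code))
```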

Data should be submitted by email, and from a valid email source. SimUser will then provide the code issued to the email address used to submit the data. Full name and address must be supplied for the source data in order to avoid false entries.

If you are willing to participate by supplying your training centre's data in complete confidence, please do so by email to: [address protected on the original page].

References

ARINC Report 433, Standard Measurements For Flight Simulator Quality, which covers simulator metrics, can be obtained from ARINC. The current price is US $98.

Discussion

This article can be discussed in the forums


Last Updated ( Monday, 26 March 2007 )
 