NATO SOFTWARE ENGINEERING CONFERENCE 1968

9. Working Papers
(d) Performance assessment. This test would check on the performance of a package. It would determine the ‘efficiency’ of a package in terms of its core store requirements, peripheral requirements and running times under various conditions.
Approval at each level would be given as soon as those carrying out the tests had established, with an acceptable degree of confidence, that a package would provide satisfactory service in field use. We cannot call for perfection, merely a reasonably high probability that the ‘average’ user will find the package usable. In practice it may be necessary to give partial approval to a package, for example approval for use only on machines over a certain size.
7. Application of the proposed procedure
It is envisaged that each ‘level’ of testing should be carried out in the following manner:
(i) Documentation check. This would be carried out very largely during the normal appraisal of the software before the machine was ordered, or at least as soon as the documentation became available. The work would take about three man-months for a major package. Every effort should be made to complete this work at least three months before a package’s proposed release date. This phase of the work would be done by attempting to use the documentation to write test programs for use in subsequent phases of the testing procedure. However, as the manuals are often produced in parallel with the package itself, it is to be expected that the programs produced will have to be amended in the light of subsequent amendments to these manuals.
(ii) Availability check. This should be carried out as soon as possible after the release of a package. Only a few test programs would be required and these would be written during the documentation check. The final debugging and running of these programs would form the availability check. It should be possible in most cases to accept the supplier’s own validation procedures for this check if, after investigation by those responsible for operating the ‘type’ approval scheme, it is found to be effective. It would probably still be necessary to carry out random checks to ensure that the company’s standards are being maintained. This phase should be concluded within a month of a package’s release. The effort required is likely to be less than one man-month per major package tested.
(iii) Detailed check. This level of approval is the most onerous part of the test procedure. To be of real value to the user, this check should be completed as soon as possible after the release of a package, certainly within three months unless there is no requirement by the user for the package in the near future. For packages for which there is no immediate requirement, evidence of extensive satisfactory use by other customers could be accepted in lieu of a detailed check. Such a procedure might also be adopted for the lesser-used packages that are unlikely to be vital to the functioning of the approval organization.
The detailed testing will be made up of two types of work: the testing of packages limited to one manufacturer, or even one machine, such as operating systems, and testing those items that tend to be machine independent, such as high level languages. The effort required for machine dependent items is likely to be large and to be a continuing load for the ‘approving’ body. However, the work involved on the standard languages, such as Cobol, Algol and Fortran, can very largely be done ‘once and for all’ with only minor adaptations having to be made to suit each new system.
The effort required is likely to be about nine man-months to test an operating system. For a well known language such as Fortran, it is likely to be about three man-months to design and write test programs for the first test of an implementation, and about one man-month for each subsequent test of that language.
(iv) Performance check. Some idea of the performance of a package will be gained during the ‘availability’ and ‘detailed’ checks but it is desirable to have separate tests aimed at measuring performance or ‘efficiency’. These performance tests should be conducted after a few months of field use to allow operating experience to be gained and any residual failings to be removed. Such tests might be conducted three to six months after the release of a package. It might be easier to measure performance comparatively rather than absolutely: an absolute test of, say, a compiler’s efficiency would be very difficult and time-consuming to carry out, whereas for a comparative test a variety of benchmark programs might suffice. However, for machine dependent packages, such as operating systems, absolute measurements will have to be made.
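The timings and effort figures quoted for checks (i) to (iv) can be collected into a small table and used for rough planning. The following is only an illustrative sketch in Python, not part of the proposed procedure: the names `Check`, `CHECKS`, `detailed_effort` and `language_effort_over` are hypothetical, the availability figure of 1.0 is an upper bound standing in for "less than one man-month", and the performance deadline of 6 is the upper end of "three to six months".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Check:
    name: str
    # Latest completion time in months relative to release
    # (negative means before the proposed release date).
    due_by_month: int
    # Effort in man-months for a major package; None where the
    # paper gives no single figure for this check.
    effort_man_months: Optional[float]


# Figures taken from the text; see the assumptions stated above.
CHECKS = [
    Check("documentation", -3, 3.0),
    Check("availability", 1, 1.0),   # assumed upper bound ("less than one")
    Check("detailed", 3, None),      # effort depends on the kind of package
    Check("performance", 6, None),   # assumed upper end of "three to six months"
]


def detailed_effort(kind: str, previous_tests: int = 0) -> float:
    """Rough man-months for one detailed check of a package.

    kind is "operating_system" or "language"; previous_tests counts
    earlier tests of the same language on other systems.
    """
    if kind == "operating_system":
        return 9.0
    if kind == "language":
        # About three man-months for the first test of a well known
        # language, about one for each subsequent test.
        return 3.0 if previous_tests == 0 else 1.0
    raise ValueError(f"unknown package kind: {kind!r}")


def language_effort_over(n_systems: int) -> float:
    """Total detailed-check effort for one language across n systems."""
    return sum(detailed_effort("language", i) for i in range(n_systems))
```

Under these figures, testing the same language on four successive systems would cost about 3 + 1 + 1 + 1 = 6 man-months, against about 9 man-months for each operating system, which is the sense in which the language work can largely be done ‘once and for all’.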