A Real World Software Assurance Test Suite
A test suite based on real-world software, with a wide variety of weaknesses inserted through a systematic process, would be very valuable for evaluating automated software assurance tools. The Intelligence Advanced Research Projects Activity (IARPA) Securely Taking on Software of Uncertain Provenance (STONESOUP) program has produced such a test suite. The STONESOUP Phase 3 test suite consists of 7769 individual test cases, of which 4581 are in C and 3188 are in Java. The test cases are derived from 12 real-world, open source applications, which serve as the base programs. To our knowledge, this is the first test suite of its kind to be based on large, real-world applications. Our presentation will describe the format, contents, and organization of the test suite. It will also cover the test case naming convention and how to specify the metadata XML file, which contains all the instructions needed to build, execute, and score a given test case.
This test suite can be understood as a natural code test suite covering many weaknesses for C and Java. The base programs do not have any intentional weaknesses. The test case generation process involves starting with a base program, injecting a weakness into that program, manually verifying that the weakness was implemented correctly, and creating variants (snippets) of this base program by changing where the weakness is injected. Pairs of input values and expected outputs are also created as part of the test case generation process. Possible uses of this test suite include, but are not limited to, investigating tool behavior (for example, testing bug-catching or exploit-prevention programs). A tool running on this test suite should stop the bad inputs from causing problems, but should not interfere with the handling of the good inputs.
Because the seeded weaknesses and the input/output pairs used on the base programs are controlled, it is possible to study a wide range of tools’ capabilities over larger data sets with a greater variety of reachable weaknesses. There is also some control over where weaknesses occur, which allows for automated scoring of results. Finally, there is some general indication of where data flow, control flow, and data transfer complexities are inserted in the code, which makes it possible to study the depth of tools’ analysis. The National Institute of Standards and Technology (NIST), an agency of the U.S. Department of Commerce, currently hosts the STONESOUP Phase 3 test suite as a virtual machine with all 7769 test cases loaded. It can be accessed from the Software Assurance Metrics and Test Evaluation (SAMATE) Software Assurance Reference Dataset (SARD) test suites page.
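The pass criterion stated above (bad inputs must be stopped, good inputs must be handled as expected) can be illustrated with a minimal sketch. This is not the STONESOUP scoring harness: the `RunResult` record, its field names, and the `score_test_case` function are hypothetical, introduced only for illustration, and the actual scoring instructions live in each test case's metadata XML file, whose format is not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunResult:
    """Outcome of one input run against a tool-protected test case (hypothetical record)."""
    is_good_input: bool       # True for a benign input, False for a bad input
    output_matches: bool      # observed output equals the expected output
    weakness_triggered: bool  # the injected weakness caused a problem


def score_test_case(runs: List[RunResult]) -> bool:
    """Pass only if every good input is handled as expected and no bad input causes a problem."""
    for run in runs:
        if run.is_good_input and not run.output_matches:
            return False  # the tool interfered with handling a good input
        if not run.is_good_input and run.weakness_triggered:
            return False  # the tool failed to stop a bad input
    return True


# Example: one good input handled correctly, one bad input stopped.
runs = [
    RunResult(is_good_input=True, output_matches=True, weakness_triggered=False),
    RunResult(is_good_input=False, output_matches=False, weakness_triggered=False),
]
print(score_test_case(runs))
```

Under these assumptions a test case fails as soon as a single good input is disturbed or a single bad input gets through, which mirrors the two-sided requirement in the abstract.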
Charles Oliveira (Primary Presenter), National Institute of Standards and Technology, firstname.lastname@example.org
Software Engineer, Guest Researcher at Software and Systems Division, IT Laboratory, National Institute of Standards and Technology