The TxFlow project built a distributed application that lets users perform skimming (selection) operations on High Energy Physics (HEP) datasets. A Java application with a graphical interface lets users specify workflow-like compound operations: selecting input files, defining which particles pass certain triggers, specifying other selection criteria such as cuts, and outputting results. Here a flow is a set of data analysis and data reduction steps that correspond to a particular configuration and execution of the software framework for the Compact Muon Solenoid (CMS) particle physics experiment at the Large Hadron Collider (LHC).
The flow’s purpose is to select events that belong to experimental Triggers and to perform Cut operations, producing a more fine-grained dataset of particle events. Once defined, a flow is submitted to a Grid service (also implemented in Java), which performs the skim operations on high-performance hosts. Because the data involved are multi-terabyte in size, all data provisioning and skimming operations must occur on the scientific Grid, and only summary results are returned to the client.
This section describes the high-level architecture of the TxFlow application. Fig. 3 shows the basic principles of the system’s design. It consists of a client that graphically assists the user in creating a workflow that is described in XML. A flow is submitted to the TxFlowService where it is interpreted by running the analysis software with a given configuration. A Flow Runner produces the correct input file and executes the software framework on the Grid. The flow is saved on the service in a database (the Flow DB) so that it can be shared and re-executed.
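The submission path described above (client builds an XML flow, the service stores it and hands it to a runner) can be sketched in a few lines of Java. This is only an illustrative sketch under stated assumptions: the class names FlowDefinition, InMemoryFlowDb, and SketchFlowService, the XML element names, and the sample file, trigger, and cut values are all hypothetical and are not taken from the TxFlow code base. An in-memory map stands in for the Flow DB, and no analysis software is actually run.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flow definition; field and XML element names are illustrative only.
final class FlowDefinition {
    final String name;
    final List<String> inputFiles = new ArrayList<>();
    final List<String> triggers = new ArrayList<>();
    final List<String> cuts = new ArrayList<>();

    FlowDefinition(String name) {
        this.name = name;
    }

    // Serialize the flow to a small XML document (assumed schema, not TxFlow's).
    String toXml() {
        StringBuilder sb = new StringBuilder();
        sb.append("<flow name=\"").append(escape(name)).append("\">");
        for (String f : inputFiles) sb.append("<input>").append(escape(f)).append("</input>");
        for (String t : triggers) sb.append("<trigger>").append(escape(t)).append("</trigger>");
        for (String c : cuts) sb.append("<cut>").append(escape(c)).append("</cut>");
        sb.append("</flow>");
        return sb.toString();
    }

    // Escape XML special characters; '&' must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}

// Stand-in for the Flow DB: an in-memory map keyed by flow name.
final class InMemoryFlowDb {
    private final Map<String, String> flows = new LinkedHashMap<>();

    void save(String name, String xml) {
        flows.put(name, xml);
    }

    String load(String name) {
        return flows.get(name); // null if the flow was never saved
    }
}

// Stand-in for the service: stores the flow so it can be shared and re-executed.
final class SketchFlowService {
    private final InMemoryFlowDb db = new InMemoryFlowDb();

    // A real service would pass the XML to a runner that launches the framework on the Grid.
    String submit(FlowDefinition flow) {
        String xml = flow.toXml();
        db.save(flow.name, xml);
        return "stored flow '" + flow.name + "' (" + xml.length() + " chars of XML)";
    }

    String storedXml(String name) {
        return db.load(name);
    }
}

final class FlowSketchDemo {
    public static void main(String[] args) {
        FlowDefinition flow = new FlowDefinition("demo");
        flow.inputFiles.add("events.root");   // placeholder file name
        flow.triggers.add("ExampleTrigger");  // placeholder trigger name
        flow.cuts.add("pt > 20");             // placeholder cut expression
        SketchFlowService service = new SketchFlowService();
        System.out.println(service.submit(flow));
        System.out.println(service.storedXml("demo"));
    }
}
```

Keeping the stored form as XML text mirrors the design point made above: because the service persists the flow description rather than its results, the same flow can later be retrieved, shared, and run again.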
This section shows the TxFlow test architecture and how it invokes and validates the Web service methods. The TxFlow test system consists of many test classes that extend the JUnit framework to validate all TxFlow application components extensively, including service connections and service method calls. Fig. 4 shows the test classes TxFlowServiceTest, ServiceInstanceTest, and ServicePropertiesTest, which contain test methods to validate two of the main service classes, TxFlowService and ServiceProperties.
Fig. 4 shows just a few of the many test classes. TxFlow has tests for nearly every function, including tests of methods specific to the client, methods specific to the service, and all project-wide utility classes and their methods. Tests are also implemented to validate service exception-handling behavior, such as TxFlowServiceTest.testGetNonexistentFlow(). The tests were developed in the course of application development following the test-driven development (TDD) process, and so provide full test coverage of every system behavior. This full coverage enables rapid evolution of the code with continuous verification that existing and new system functionality works as expected.
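To illustrate the kind of exception-handling test mentioned above, the following sketch shows one plausible shape for a check like testGetNonexistentFlow(). It is not the TxFlow test itself: StubFlowService, FlowNotFoundException, and the getFlow/saveFlow signatures are hypothetical stand-ins, since the text does not give the real service interface or exception type. To remain runnable without the JUnit library, the checks are plain static methods that throw AssertionError on failure; in a JUnit test class the same bodies would be ordinary test methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception type; the actual TxFlow exception class is not named in the text.
class FlowNotFoundException extends Exception {
    FlowNotFoundException(String flowId) {
        super("No such flow: " + flowId);
    }
}

// Minimal stand-in for the service under test.
class StubFlowService {
    private final Map<String, String> flows = new HashMap<>();

    void saveFlow(String id, String xml) {
        flows.put(id, xml);
    }

    String getFlow(String id) throws FlowNotFoundException {
        String xml = flows.get(id);
        if (xml == null) {
            throw new FlowNotFoundException(id);
        }
        return xml;
    }
}

class StubFlowServiceTest {
    // Requesting an unknown flow must fail with the dedicated exception.
    static void testGetNonexistentFlow() {
        StubFlowService service = new StubFlowService();
        try {
            service.getFlow("no-such-flow");
            throw new AssertionError("expected FlowNotFoundException");
        } catch (FlowNotFoundException expected) {
            // Pass: the service rejected the unknown flow id.
        }
    }

    // A saved flow must be returned unchanged.
    static void testGetExistingFlow() {
        StubFlowService service = new StubFlowService();
        service.saveFlow("demo", "<flow name=\"demo\"/>");
        try {
            if (!"<flow name=\"demo\"/>".equals(service.getFlow("demo"))) {
                throw new AssertionError("stored flow was not returned unchanged");
            }
        } catch (FlowNotFoundException e) {
            throw new AssertionError("saved flow should be retrievable", e);
        }
    }

    public static void main(String[] args) {
        testGetNonexistentFlow();
        testGetExistingFlow();
        System.out.println("all checks passed");
    }
}
```

Pairing the failure case with a success case guards against a service that throws for every request, which would otherwise satisfy the negative test on its own.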