Validating the Analytical Results


CHAPTER SYNOPSIS

The process of analysis validation is as important as the original analysis. There are numerous methods available to analysts to validate their work, but not all validation techniques work equally well for all types of analysis.

This chapter describes some of the most commonly used techniques, explains how they can be applied effectively, discusses the conditions under which each is appropriate, and outlines the advantages and disadvantages of each.

Validating the Analysis

Having completed the analysis process and having documented the results, the analyst's last and most critical steps are directed toward the validation of the work. The validation process is performed to ensure that

  1. All parties agree that the conditions as presented in the documentation accurately represent the environment.
  2. The documents generated contain statements that are complete, accurate, and unambiguous.
  3. The conclusions presented are supportable by the facts as presented.
  4. The recommendations address the stated problems and are in accord with the needs of the users.

The analytical results not only represent the analysts' understanding of the current environment and their diagnosis of the problems inherent in that environment, but also their understanding of the user's unfulfilled current requirements and projected needs for the foreseeable future. A combination of these four aspects of analysis will be used to devise a design for future implementation. Thus it is imperative that the analysis be as correct and as complete as possible.

Just as there are multiple techniques available during the analysis process itself, so too there are multiple techniques available to validate the analysis. The goal of the validation process is to ensure that all the pieces have been identified, understood individually and in context, and described properly and completely.

A system is a complex whole. Old systems are not only complex, but in many cases they are a patchwork of processes, procedures, and tasks which were assembled over time and which may no longer fit together into a coherent whole. Many times needs arose which required makeshift procedures to solve a particular problem. Over time these procedures became institutionalized. Individually they may make sense and may even work; however, in the larger context of the organization, they are inaccurate, incomplete, and confusing.

Most organizations are faced with many systems which are so old and so patched, incomplete, complex, and undocumented that no one fully understands all of the intricacies and problems inherent in them, much less has a complete overview. The representation of the environment as portrayed by the analyst may be the first time that any user sees the entirety of the functional operations. If the environment is particularly large or complex, it could take both user and analyst almost as long to validate the analysis as it did to generate it, although this is probably extreme. The validation of the products of the analysis phase must address the two aspects of the environment: data and the processing of data.

Analysis seeks to decompose a complex whole into its component parts. For data it seeks

  1. To trace the data flows into, from, and through the organization
  2. To identify and describe all current and proposed data inputs, and to determine the organization's need for the various component informational elements of those inputs
  3. To identify and describe all current and proposed data outputs and to determine the value, completeness, and accuracy of those outputs and their relevance to the intended recipients
  4. To identify all current and proposed data stores (ongoing files), and to determine the value, completeness, and accuracy of those data stores and their relevance for their intended owners

For processes it seeks

  1. To trace the processing flows and their component tasks, individually, in relation to each other, and in the context of the user function and the overall functions of the organization
  2. To identify and describe all current and proposed data inputs which trigger those processes and the outputs which result from those processes, and to determine the organization's need for the process
  3. To identify and describe all current and proposed results of the individual processes, the completeness and accuracy of the processing, and the relevance of that processing to their intended recipients
  4. To place the process within the larger context of the functions of the firm

Validation seeks to ensure that the goals of analysis have been met and that the results of each of the three component parts of the analysis (current environment, problem identification, and future environment proposal) are complete and accurate, solve the user's problems, and reflect the expressed needs of the user and of the business. It also attempts to ensure that for each component, the identified component parts re-create or are consistent with the whole.

To use an analogy, the analysis process is similar to taking a broken appliance apart, repairing the defective part, and putting the appliance back together again. It is easy to take the appliance apart, somewhat more difficult to isolate the defective part and repair it, and most difficult to put all the pieces back together so that the appliance works. The latter is especially true when the schematic for the appliance is missing, incomplete, or, worse, inaccurate.

The documentation created as a result of the analysis is similar to a schematic created as the appliance is being disassembled; the validation process is similar to trying to put the appliance back together using the schematic you created. In many respects the validation process is like the testing processes during implementation. The system may appear to be complete and may appear to have addressed all possible conditions, but that can only be assured by running exhaustive tests. These tests are usually run first at the unit level, then at the procedure level, and finally at the system level.

As with implementation testing, one must run through all possible combinations of logical pathing and must run all possible transaction types through the system to ensure that the implementation handles them properly. Test data should not be created by the people who wrote the code; likewise analysis validation should run through all possible logic paths, test all possible transactions, and be conducted by persons not associated with the actual analysis. The techniques for validation of the products of analysis seek to ensure that the verifier sees what is there and not what should be there.

By its very nature, systems analysis works to identify, define, and describe the various component pieces of the system. Each activity and each investigation seeks to identify and describe a specific piece. The piece may be macro or micro, but it is nonetheless a piece. Although it is usually necessary to create overview models, these overview models, at the enterprise and functional levels, seek only to create a framework or guidelines for the meat of the analysis, which is focused on the operational tasks. It is the detail at the operational levels which can be validated. Validation of both data and processes works at this level. Each activity, each output, and each transaction identified at the lowest levels must be traced from its end point to its highest level of aggregation or to its point of origination.

Many of the techniques which are employed in the analysis phases themselves may be used to validate that analysis. The primary differences between using a technique during original analysis and using it during validation are in what they are applied to and what the analyst is attempting to achieve.

In the analysis, the analyst is gathering facts, seeking to put together a picture of the current environment, and diagnosing that environment to determine any flaws in its structure or points where improvements may be made. By using this information, the analyst can create a picture of the environment as it could and should be in the future. At the outset, both the environment and the format and content of the pictures are vague at best and unknown at worst. If the analysis and the diagnoses are correct, the proposed environment will satisfy the user's needs both immediately and in the long run.

During validation, the analyst begins with an understanding of the environment and the pictures or models that have been constructed. The goal here, however, is to determine

  1. Whether the analyst's understanding of the environment is complete and correct
  2. Whether the depictions of the current environment match what is actually there and the user's understanding of the environment
  3. Whether the analyst's diagnoses correspond with the user's own perception of the problems and areas for improvement
  4. Whether the proposed future environment will satisfy the user's perceptions of his or her immediate and long-term needs

It must be understood that the analysis, diagnoses, and proposals represent a combination of both fact and opinion. They are also heavily subjective. They are based upon interview, observation, and perception. The purpose of validation is to assure that perception and subjectivity have not distorted the facts.

The generation of diagrammatic models at the functional, process, and data levels greatly facilitates the process of validation. Where these models have been drawn from the analytical information and where they are supplemented by detailed narratives, the validation process may be reduced to two stages.

Stage 1

The diagrams are cross-referenced to the narratives to ensure that

  1. Each says the same thing.
  2. Each figure on the diagram has a corresponding narrative, and vice versa.
  3. The diagrams contain no unterminated flows; there are no unconnected figures or ambiguous connections; all figures and all connections are clearly and completely labeled and cross-referenced to the accompanying narratives.
  4. The diagrams are consistent within themselves, that is, data diagrams contain only data, process diagrams contain only processes, and function models contain only functions.
  5. Each diagram is clearly labeled and a legend has been provided which identifies the meaning of each symbol used.
  6. When the complexity of the user environment is such that the models must be segmented into many parts, each part is consistently labeled and titled, the legends are clear, connectors between the parts are consistent in their forward and backward references, and names of figures which appear in different parts are consistent.
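
Several of these checks are mechanical enough to be tooled once the diagrams and narratives have been transcribed into machine-readable form. The following is a minimal sketch, in Python, of checks 2 and 3 above; the representation (a dictionary of figures keyed by label) and all names in it are illustrative assumptions, not part of any particular CASE tool or of this methodology.

```python
# A minimal sketch of Stage 1 checks 2 and 3, assuming diagrams and narratives
# have been transcribed into simple dictionaries keyed by figure label.
from dataclasses import dataclass, field

@dataclass
class Figure:
    label: str
    connections: list = field(default_factory=list)  # labels of connected figures

def check_diagram_against_narratives(figures, narratives):
    """Return a list of discrepancy messages (an empty list means clean)."""
    problems = []
    # Check 2: every figure needs a narrative, and vice versa.
    for label in figures.keys() - narratives.keys():
        problems.append(f"figure '{label}' has no corresponding narrative")
    for label in narratives.keys() - figures.keys():
        problems.append(f"narrative '{label}' has no corresponding figure")
    # Check 3: no unconnected figures and no connections to unknown figures.
    for label, fig in figures.items():
        if not fig.connections:
            problems.append(f"figure '{label}' is unconnected")
        for target in fig.connections:
            if target not in figures:
                problems.append(f"figure '{label}' flows to unknown '{target}'")
    return problems

figures = {"P1": Figure("P1", ["P2"]), "P2": Figure("P2", ["P1"]),
           "P3": Figure("P3")}                     # P3: unconnected, no narrative
narratives = {"P1": "Receive order", "P2": "Validate order",
              "P4": "A narrative with no figure"}
for message in check_diagram_against_narratives(figures, narratives):
    print(message)
```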

Stage 2

Cross-referencing across the models ensures that

  1. Processes are referenced back to their owner functions, and functions reference their component processes.
  2. Any relationships identified between data entities have a corresponding process which captures and maintains them.
  3. All data identified as being part of the firm's data model have corresponding processes that capture, validate, maintain, delete, and use them.
  4. All processing views of the data are accounted for within the data models.
  5. References to either data or processes within the individual models are consistent across the models.
  6. All data expected by the various processes are accounted for in the data models.
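
Check 3 is representative of how these cross-model checks can be mechanized: it is a coverage test in which every entity in the data model must be touched by at least one process for each required action. A minimal sketch follows, under assumed (illustrative) representations of the two models.

```python
# A minimal sketch of Stage 2 check 3. The representations are assumptions
# made for illustration, not prescribed by the methodology.
REQUIRED_ACTIONS = {"capture", "validate", "maintain", "delete", "use"}

def uncovered_entities(entities, process_actions):
    """entities: set of entity names from the data model.
    process_actions: dict of process name -> list of (action, entity) pairs."""
    covered = {entity: set() for entity in entities}
    for actions in process_actions.values():
        for action, entity in actions:
            if entity in covered:
                covered[entity].add(action)
    # Report each entity that is missing one or more required actions.
    return {entity: sorted(REQUIRED_ACTIONS - actions)
            for entity, actions in covered.items()
            if REQUIRED_ACTIONS - actions}

gaps = uncovered_entities(
    {"customer", "order"},
    {"order entry": [("capture", "order"), ("validate", "order"),
                     ("capture", "customer"), ("validate", "customer")],
     "billing": [("use", "order"), ("maintain", "order"), ("use", "customer")],
     "purge run": [("delete", "order")]})
print(gaps)  # {'customer': ['delete', 'maintain']}
```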

Walk Throughs

Walk throughs are one of the most effective methods for validation. In effect they are presentations of the analytical results to a group of people who were not party to the initial analysis. This group should be composed of representatives of all levels of the affected user areas as well as the analysts involved. The function of this group is to determine whether any points have been missed, whether all the problems have been correctly identified, and whether the proposed solutions are viable. In effect this is a review committee.

Since the analysis documentation should be self-explanatory and unambiguous, it should be readily understandable by any member of the group. Before the walk through, the group's members should read the documentation and note any questions or areas which need clarification. The walk through itself should take the form of a presentation by the analysts to the group and should be followed up by question and answer periods. Any modifications or corrections required to the documents should be noted. If any areas have been missed, the analyst may have to perform the needed interviews, and a second presentation may be needed. The validation process should address the documentation and models developed from the top two levels (the strategic and the managerial) using the detail from the operational level.

Each data element or group of data elements contained in each of these detail transactions should be traced through these models, end to end, that is, from its origination on a source document through all of its transformations into output reports and stored files. Each data element or group of data elements contained in each of these stored files and output reports should be traced back to a single source document.

Each process which handles an original document or transaction should be traced to its end point. That is, the process-to-process flows should be traced and a determination made as to whether all identified inputs are available when the various processes require them. All redundant or ambiguous reports and files should have been noted and eliminated.

Processes should be associated with their owner functions and a note made as to whether any processes are ownerless or have multiple owners. Comparisons should be made between processes to determine whether those with similarities in data needs, time frames, and task content have been combined.

Input/Output Validation

This class of analytical validation begins with system inputs and traces all flows to the final outputs. It also includes transaction analysis, data flow analysis, data source and use analysis, and data event analysis. For these methods each data input to the system is traced to its final destination. Data transformations and manipulations are examined (using data flow diagrams), and outputs are documented. This type of analysis is usually left-to-right, in that the inputs are usually portrayed as coming in on the left and going out from the right. Data flow analysis techniques are used to depict the flow through successive levels of process decomposition, arriving ultimately at the unit task level.

Validation of these methods requires that the analyst and user work backward from the outputs to the inputs. To accomplish this, each output or storage item (data items stored in ongoing files are also outputs) is traced back through the documented transformations and processes to its ultimate source. Output-to-input validation does not require that all data inputs be used; however, every output or stored data item must have an ultimate input source, and should have only a single one.

Input-to-output validation works in the reverse manner: each input item is traced through its processes and transformations to the final outputs. Validation in either direction looks for data items that have multiple sources as well as those that are acquired but not used.
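
Both directions of this tracing can be pictured as walks over a derivation graph in which each data item records the items it is copied or computed from. The sketch below is a minimal illustration of the backward (output-to-input) walk and of the unused-input check; the graph and the item names are invented for the example.

```python
# A minimal sketch of output-to-input tracing, plus the check for inputs
# that are acquired but never used. The graph and item names are invented.
def trace_sources(item, derived_from, path=()):
    """Walk backward from an output item to its ultimate input source(s)."""
    if item in path:                  # a cycle is itself a documentation error
        return {f"CYCLE at {item}"}
    parents = derived_from.get(item, [])
    if not parents:                   # no parents: an ultimate input source
        return {item}
    sources = set()
    for parent in parents:
        sources |= trace_sources(parent, derived_from, path + (item,))
    return sources

def unused_inputs(all_inputs, derived_from):
    """Inputs that never feed any transformation were acquired but not used."""
    used = {p for parents in derived_from.values() for p in parents}
    return set(all_inputs) - used

# derived_from maps each data item to the items it is computed or copied from.
derived_from = {
    "invoice.total": ["order.qty", "catalog.price"],
    "order.qty": ["order_form.qty"],
    "catalog.price": ["price_sheet.price", "override.price"],  # two sources
}
print(trace_sources("invoice.total", derived_from))
print(unused_inputs({"order_form.qty", "order_form.fax_no"}, derived_from))
# 'catalog.price' having two ultimate sources, and 'order_form.fax_no' being
# acquired but unused, are exactly the conditions this validation looks for.
```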

Data-Source-to-Use Validation

Data-source-to-use validation seeks to determine whether the data gathered by the firm is

  1. Needed
  2. Verified or validated in the appropriate manner
  3. Useful in the form acquired
  4. Acquired at the appropriate point by the appropriate functional unit
  5. Complete, accurate, and reliable
  6. Made available to all functional areas which need it
  7. Saved for an appropriate length of time
  8. Modified by the appropriate unit in a correct and timely manner
  9. Discarded when it is no longer of use to the firm
  10. Correctly and appropriately identified when it is used by the firm
  11. Appropriately documented as to the type and mode of transformation when it does not appear in its original form
  12. Appropriately categorized as to its sensitivity and criticality to the firm

This method of validation is approached at the data element level and disregards the particular documents which carry the data. The rationale here is that data, although initially aggregated into documents, tend to scatter or fragment within the data flows of the firm. Conversely, once within the data flows of the firm, data tend to aggregate in different ways. That is, data are brought together into different collections and from many different sources for various processing purposes.

Some data are used for reference purposes, and some are generated as a result of various processing steps and transformations. The result is a web of data which can be mapped irrespective of the processing flows. The data flow models and the data models from the data analysis in the various phases are particularly useful here. The analyst must be sure to cross-reference and cross-validate both the data model from the existing system and the data model from the proposed system.

The new data model should contain new data which must be collected, old data which is transferred intact, and old data which is transformed in some manner from the old to the new model. All data proposed in the new model should be justified, not only from a business need perspective, but also from a cost of acquisition, processing, and storage perspective. That is, the analyst must determine whether these costs are justified by the value of the data, both old and new, to the firm.

Consistency Analysis

Consistency analysis for data seeks to determine whether

  1. All data elements have been appropriately named and defined
  2. All transformations have been identified and described
  3. All appearances of the data element have been noted and documented
  4. Data elements have been defined in the proper manner and the documented transformations are correct
  5. All calculations and data derivations have been identified and correctly defined and documented
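
Checks 1 and 3 in this list reduce to collecting every appearance of every data element and comparing the recorded definitions. The following is a minimal sketch with invented element names and documents.

```python
# A minimal sketch of data consistency checks 1 and 3: gather every appearance
# of each data element and flag elements whose definitions disagree across
# documents. The tuple representation is an assumption made for illustration.
def inconsistent_definitions(appearances):
    """appearances: list of (element, document, definition) tuples."""
    seen = {}
    for element, document, definition in appearances:
        seen.setdefault(element, {})[document] = definition
    # An element defined differently in different documents is inconsistent.
    return {element: documents for element, documents in seen.items()
            if len(set(documents.values())) > 1}

appearances = [
    ("customer_id", "order form", "9-digit numeric"),
    ("customer_id", "invoice", "9-digit numeric"),
    ("customer_id", "shipping notice", "12-character alphanumeric"),  # conflict
    ("order_date", "order form", "YYYY-MM-DD"),
]
print(inconsistent_definitions(appearances))
```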

Consistency analysis for processes seeks to determine whether

  1. All processes link together appropriately
  2. All processes are being performed in the proper time frame
  3. The resources necessary for the process are available when needed
  4. All proposed processes are consistent with the job descriptions of those expected to perform them
  5. All proposed processes are consistent with the functional descriptions or charters of the organizations to which they have been delegated
  6. The people charged with the responsibility for performance of the processes have the necessary authority
  7. The supervisory personnel understand the processes in the same way that the persons actually performing them do

Overall, consistency analysis seeks to achieve an end-to-end test of the analysis products, looking for inconsistent references; missing data, processes, documents, or transactions; overlooked activities, functions, inputs, or outputs; etc.

In many respects consistency analysis should be conducted in conjunction with the other validation processes. The analyst and the user alike should be looking for "holes" in the documents; the aim, unlike that of the other techniques, is not to determine "correctness" but completeness.

Level-to-Level Consistency

Depending upon the size and scope of the project, the analysis may have been conducted in levels. That is, the analysis team may first have tried to achieve a broad-brush overview of the user function. This may have been followed by a more narrowly focused analysis of a particular function or group of functions, working in successively finer and finer detail until finally arriving at the operational task level. This approach usually works each level to completion before beginning the next lower level. Although in theory the work at the level under analysis should be guided by the work at the preceding (higher) level, in practice this may not be true.

The reality is that these levels may be under concurrent analysis or may be analyzed by different teams. The latter is especially true in long running projects where staffing turnover, transfer, and promotion continually change the makeup of the analysis team.

Level-to-level consistency validation seeks, in a manner similar to the traditional consistency analysis, to ensure that the information, perspective, and findings at each level correspond to, or at least do not conflict with, the information, perspective, and findings at both higher and lower levels.
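
Where each lower-level item records its parent at the level above, this check can be partially mechanized. The sketch below is a minimal illustration with invented function and process names; it flags lower-level items whose stated parent does not exist and higher-level items with no decomposition at all.

```python
# A minimal sketch of a level-to-level check: every lower-level process must
# name a parent that exists at the higher level, and every higher-level
# function should decompose into at least one lower-level process.
def level_gaps(higher_level, lower_to_parent):
    """higher_level: set of names; lower_to_parent: dict of child -> parent."""
    orphans = {child for child, parent in lower_to_parent.items()
               if parent not in higher_level}
    childless = higher_level - set(lower_to_parent.values())
    return orphans, childless

orphans, childless = level_gaps(
    {"Order Management", "Billing", "Collections"},
    {"enter order": "Order Management",
     "price order": "Order Mgmt",        # misspelled parent: an orphan
     "post invoice": "Billing"})
print(orphans)    # {'price order'}
print(childless)  # {'Collections'}
```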

Carrier-to-Data Consistency

Carrier-to-data consistency focuses on the data transactions and ensures that the data on the documents (data carriers) which enter the firm are passed consistently and accurately to all areas where they are needed.

This phase of the validation seeks to determine whether the firm is receiving the correct data, whether it understands what data it is receiving (i.e., what the transmitter of that data intended), and whether it is using that data in a manner which is consistent with the data's origin.

With few exceptions, the firm collects data on its own forms, and, with even fewer exceptions, it can specify what data it needs and in what form. This level of validation can therefore compare the firm's use of the incoming data with the data received, to ensure that the forms, instructions, and procedures for collection, acquisition, and dissemination are consistent with the data's subsequent usage.

Process-to-Process Consistency

Process-to-process consistency seeks to trace the flows of processes and ensure that the processes are consistent with each other. That is, if one process is expecting data from another, then they both must have a common understanding of the data to be transferred. Process-to-process consistency validation seeks to answer the following questions.

  1. Are the process time constraints similar between and across processes, and are all processes aware of those constraints?
  2. Have all processes been identified?
  3. Are there accompanying narratives for each process as a whole and for each component task?
  4. Are all inputs, outputs, data storage and retrieval activities, and forms, reports, and transactions for each process clearly identified and described?
  5. Have all forward and backward references for these items been clearly indicated? Do those references have corresponding references in the sourcing or receiving process?
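
Question 5 in particular lends itself to a mechanical check: every data flow a process expects to receive must be declared as sent by the sourcing process, and both must describe the same items. A minimal sketch, with invented process names and data items:

```python
# A minimal sketch of a process-to-process consistency check. The "sends" /
# "expects" representation is an assumption made for illustration.
def unmatched_flows(processes):
    """processes: dict name -> {"sends": {target: items}, "expects": {source: items}}."""
    problems = []
    for name, process in processes.items():
        for source, items in process.get("expects", {}).items():
            sent = processes.get(source, {}).get("sends", {}).get(name, set())
            missing = set(items) - set(sent)
            if missing:
                problems.append(f"{name} expects {sorted(missing)} from {source}, "
                                f"but {source} does not send them")
    return problems

processes = {
    "order entry": {"sends": {"billing": {"order_no", "qty"}}},
    "billing": {"expects": {"order entry": {"order_no", "qty", "price"}}},
}
for problem in unmatched_flows(processes):
    print(problem)  # billing expects ['price'] from order entry, but ...
```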

Other Analytical Techniques

Zero-based analysis

Zero-based analysis is similar in nature to zero-based budgeting, in that it assumes nothing is known about the existing environment. It is a "start from scratch" approach. All user functions, processes, and tasks are reexamined and rejustified. The reasons for each user activity are documented, and all work flows are retraced.

Zero-based analysis is especially necessary for projects in areas undergoing resystemization or reautomation. Here the existing systems and automation may have been erroneous, or the environment may have changed sufficiently to warrant this start-from-scratch approach.

The analyst should never assume that the original reasons for collecting or processing the data are still valid. Each data transaction and each process must be examined as if it were being proposed for the first time or for a new system. The analyst must ask

  1. Does this need to be done?
  2. Are all the steps correct? And, are they all necessary?
  3. Should this task or process be performed by the current unit?
  4. Does the work accomplished justify the resources being devoted to it?
  5. For each report
    1. Is this report necessary?
    2. If it is necessary, is it necessary in its current form?
    3. Is it still necessary at its current level of detail?
    4. Should it be produced as frequently as it is?
  6. For each process
    1. Is it still necessary?
    2. Does it need to be performed as frequently as it is?
    3. If the process is manual, should it be automated, and if currently automated, should it revert to manual?
    4. Do the procedures and standards which govern the process still apply, or should they be revised? Should they be made simpler, or more comprehensive?
    5. Can it be combined with other similar processes?
    6. Has it grown so complex that it needs to be fragmented into a larger number of more simplified processes?
    7. Can the cost of the process be justified in terms of the benefit to the firm?
    8. Does the volume of work expected justify the size of the organization or the resources devoted to it?

Data event analysis

This modification of transaction analysis assumes that all business activities are triggered by data events or data transactions. These data events or data transactions are stimuli from either internal or external sources which constitute the day-to-day business of the firm.

These stimuli come from a variety of sources and cause the business to react in predictable ways. These data events may be manual or automated, or in some cases may result from the passage of time or from some other internal activity. There may or may not be any physical notification of the event. In the absence of any physical data carrier, or even if one is present, the analyst must determine

  1. What the firm needs to know about the event
  2. How that knowledge may be validated
  3. What the firm should do with the information
  4. Where and how much of it must be stored for future use

These actions and determinations should be analyzed irrespective of the persons or areas which may actually respond and of the format and content of the actual event notification. Each data event results in a data activity flow. A data activity flow consists of a series of data handling activities which are concerned purely with the receipt, validation, and movement of the data, and not with the processing or manipulation of that data. Data event flows may incorporate multiple processing steps or multiple processes.

Each data event flow starts with a data trigger and is traced from its initial identification or recognition by the firm through its eventual storage in the firm's files. Data activities are limited to

  1. Receipt, identification, or recognition
  2. Retrieval of any previously stored operational or reference data
  3. Verification of the contents of the data trigger
  4. Addition, update, or deletion of the items of interest from the data trigger within the firm's previously acquired data
  5. Archiving of any previous occurrences of the same data
  6. Storage of the new data in the firm's files
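
The shape of such a flow can be made concrete with a small sketch. The following is a minimal illustration of a data activity flow limited to the six activities above; the event shape, the store, and the archive are assumptions made for the example, and processing or manipulation of the data is deliberately excluded.

```python
# A minimal sketch of a data activity flow for a single data event, limited
# to the six data activities listed above. All structures are illustrative.
def handle_data_event(event, store, archive):
    # 1. Receipt, identification, or recognition
    key, items = event["key"], event["items"]
    # 2. Retrieval of any previously stored operational or reference data
    previous = store.get(key)
    # 3. Verification of the contents of the data trigger
    if not all(value is not None for value in items.values()):
        raise ValueError(f"incomplete data trigger for {key}")
    # 5. Archiving of any previous occurrence of the same data
    if previous is not None:
        archive.setdefault(key, []).append(previous)
    # 4 and 6. Add/update the items of interest and store the new data
    store[key] = {**(previous or {}), **items}

store, archive = {}, {}
handle_data_event({"key": "cust-42", "items": {"name": "Acme"}}, store, archive)
handle_data_event({"key": "cust-42", "items": {"name": "Acme Corp"}}, store, archive)
print(store)    # {'cust-42': {'name': 'Acme Corp'}}
print(archive)  # {'cust-42': [{'name': 'Acme'}]}
```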

Methods and procedures analysis

Although the tendency is for data processing professionals to assume that all solutions to user problems must be automated data processing solutions, this is not always the case. The user's problems are business problems, not necessarily data manipulation or presentation problems. This methodology includes generally accepted principles of business systems analysis and integrates both automated and manual analysis, and automated and manual solutions. It analyzes the user's functions, processes, activities, and tasks and includes manual-to-manual, automated-to-manual, manual-to-automated, and automated-to-automated flows.

General-to-specific analysis

The top-down approach to analysis and design requires that its first phases be, of necessity, rather general. That is, the functions, processes, and data are treated in the abstract, as generic forms and as aggregates, rather than as specific tasks. In other words, rather than attempting to dive right into the detail of the tasks, the analyst should first try to put together less detailed (although not necessarily less complex) pictures of the firm, and, as knowledge and understanding increase, to create successively more detailed pictures.

The analyst first attempts to gain an understanding, or overview, of the entire firm, the enterprise view, and from there proceeds to more specific and thus more detailed, views of individual user areas and activities. These detailed views are always developed within the enterprise framework. The enterprise framework is used to guide the development of and to validate the specific, detail views.

