
CrossTalk - The Journal of Defense Software Engineering
Apr 2005 Issue

Inside SEER-SEM
Lee Fischman, Galorath, Inc.
Karen McRitchie, Galorath, Inc.
Daniel D. Galorath, Galorath, Inc.

The System Evaluation and Estimation of Resources - Software Estimating Model (SEER-SEM) is a commercially available software project estimation model used within defense, government, and commercial enterprises. Introduced over a decade ago and now in its seventh release, it offers a case study in the history and future of such models. SEER-SEM and its brethren are built upon a mix of mathematics and statistics; this article provides insight into its inner workings and basis of estimation.

If you follow the roots of software estimation models, you will find many have common ancestors. The System Evaluation and Estimation of Resources - Software Estimating Model (SEER-SEM) began with the Jensen model and diverged significantly in the early 1990s. Barry Boehm's Constructive Cost Model work provided for the redefinition of some of the original Jensen model parameters into SEER-SEM. Don Reifer and Dan Galorath's work on the NASA Softcost model also found its way into SEER-SEM in addition to Halstead's software science metrics. The Jensen model itself was first calibrated using some of the same data as the Putnam model. Earlier work by Doty Associates introduced the idea of factoring in development environment influences via parameters. Work on this model continues today.

SEER-SEM's Architecture

SEER-SEM is composed of a group of models working together to provide estimates of effort, duration, staffing, and defects. These models can be briefly described by the questions they answer:

  • Sizing. How large is the software project being estimated?
  • Technology. How productive are the developers?
  • Effort and Schedule Calculation. What amount of effort and time are required to complete the project?
  • Constrained Effort/Schedule Calculation. How does the expected project outcome change when schedule and staffing constraints are applied?
  • Activity and Labor Allocation. How should activities and labor be allocated into the estimate?
  • Cost Calculation. Given expected effort, duration, and the labor allocation, how much will the project cost?
  • Defect Calculation. Given product type, project duration, and other information, what is the expected, objective quality of the delivered software?
  • Maintenance Effort Calculation. How much effort will be required to adequately maintain and upgrade a fielded software system?

Software Sizing

Software size is a key input to any estimating model, SEER-SEM being no exception. Supported sizing metrics include source lines of code (SLOC), function-based sizing (FBS) and a range of other measures. They are translated for internal use into effective size (Se). Se is a form of common currency within the model and enables new, reused, and even commercial off-the-shelf code to be mixed for an integrated analysis of the software development process. The generic calculation for Se is:

$$S_e = \mathit{NewSize} + \mathit{ExistingSize} \times (0.4 \times \mathit{Redesign} + 0.25 \times \mathit{Reimpl} + 0.35 \times \mathit{Retest})$$

As indicated, Se increases in direct proportion to the amount of new software being developed. Se increases by a lesser amount as preexisting code is reused in a project. The extent of this increase is governed by the amount of rework (redesign, re-implementation, and retest) required to reuse the code.
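
As a minimal sketch, the generic effective size calculation can be written out directly. The function name, the treatment of the three rework inputs as fractions between 0 and 1, and the sample project are assumptions made for illustration; the article does not say how the rework inputs are scaled.

```python
def effective_size(new_size, existing_size, redesign, reimpl, retest):
    """Effective size Se from the generic formula in the text.

    redesign, reimpl and retest are assumed here to be fractions in [0, 1]
    describing how much of the existing code needs each kind of rework.
    """
    for name, frac in (("redesign", redesign), ("reimpl", reimpl), ("retest", retest)):
        if not 0.0 <= frac <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {frac}")
    # The weights 0.4, 0.25 and 0.35 are the ones given in the article.
    rework = 0.4 * redesign + 0.25 * reimpl + 0.35 * retest
    return new_size + existing_size * rework


# Hypothetical project: 10,000 new SLOC plus 50,000 reused SLOC that needs
# 10% redesign, 5% re-implementation and 40% retest.
se = effective_size(10_000, 50_000, redesign=0.10, reimpl=0.05, retest=0.40)
```

Because the three weights sum to 1.0, existing code that is fully redesigned, re-implemented, and retested counts the same as new code, while untouched code contributes nothing.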

Function-Based Sizing

While SLOC is an accepted way of measuring the absolute size of code from the developer's perspective, metrics such as function points capture software size functionally from the user's perspective. The function-based sizing (FBS) metric extends function points so that hidden parts of software such as complex algorithms can be sized more readily. FBS is translated directly into unadjusted function points (UFP).

In SEER-SEM, all size metrics are translated to Se, including those entered using FBS. This is not a simple conversion, i.e., not a language-driven adjustment as is done with the much-derided backfiring method. Rather, the model incorporates factors, including phase at estimate, operating environment, application type, and application complexity. All these considerations significantly affect the mapping between functional size and Se. After FBS is translated into function points, it is then converted into Se as:

$$S_e = L_x \times (\mathit{AdjFactor} \times \mathit{UFP})^{\mathit{Entropy}/1.2}$$

where,

Lx is a language-dependent expansion factor.

AdjFactor is the outcome of calculations involving other factors mentioned above.

Entropy ranges from 1.04 to 1.2 depending on the type of software being developed.
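
The conversion can be sketched as follows. The function name and all sample values are hypothetical; in particular, the article gives no values for Lx or AdjFactor, so the numbers below only illustrate how the exponent behaves.

```python
def fbs_to_effective_size(ufp, lx, adj_factor, entropy):
    """Convert unadjusted function points to effective size Se.

    Implements Se = Lx * (AdjFactor * UFP) ** (Entropy / 1.2) as given in
    the text. The range check reflects the stated Entropy range.
    """
    if not 1.04 <= entropy <= 1.2:
        raise ValueError("entropy is expected to lie between 1.04 and 1.2")
    return lx * (adj_factor * ufp) ** (entropy / 1.2)


# Hypothetical inputs: 200 UFP, an expansion factor of 55, a neutral adjustment.
se_high_entropy = fbs_to_effective_size(200, lx=55, adj_factor=1.0, entropy=1.2)
se_low_entropy = fbs_to_effective_size(200, lx=55, adj_factor=1.0, entropy=1.04)
```

At the top of the Entropy range the exponent is exactly 1, so Se is simply proportional to the adjusted function point count. At lower Entropy values the exponent drops below 1 and Se grows less than proportionally with functional size.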

Effort and Duration Calculations

A project's effort and duration are interrelated, as is reflected in their calculation within the model. Effort drives duration, notwithstanding productivity-related feedback between duration constraints and effort. The basic effort equation is:

$$K = D^{0.4} (S_e / C_{te})^{1.2}$$

where,

Se is effective size - introduced earlier.

Cte is effective technology - a composite metric that captures factors relating to the efficiency or productivity with which development can be carried out. An extensive set of people, process, and product parameters feed into the effective technology rating. A higher rating means that development will be more productive.

D is staffing complexity - a rating of the project's inherent difficulty in terms of the rate at which staff are added to a project.

The general form of this equation should not be a surprise. In numerous empirical studies, the effort-size relationship has been seen to assume the general form $y = a \times \mathit{size}^b$, with a as the linear multiplier on size and the exponent b ranging between 0.9 and 1.2 depending on available data. Most experts feel that b > 1 is a reasonable assumption, meaning that effort increases at a proportionally faster rate than size. While SEER-SEM's value of 1.2 is at the high end of this range, the formula above is only part of the estimating process.
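
A short sketch of the basic effort equation shows what an exponent of 1.2 means in practice. The function name and the input values are hypothetical, and the article does not state the units of K, so the result is left in the model's own effort units.

```python
def effort(se, cte, d):
    """Basic effort K = D**0.4 * (Se / Cte)**1.2, in the model's effort units."""
    if se < 0 or cte <= 0 or d <= 0:
        raise ValueError("se must be non-negative; cte and d must be positive")
    return d ** 0.4 * (se / cte) ** 1.2


# Hypothetical inputs, chosen only to show the shape of the relationship.
base = effort(se=20_000, cte=5_000, d=15)
doubled = effort(se=40_000, cte=5_000, d=15)

# Ratio of the two estimates: 2 ** 1.2, roughly 2.3.
growth = doubled / base
```

Doubling effective size while holding technology and staffing complexity fixed multiplies effort by about 2.3, which is the "proportionally faster" growth described above.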

Once effort is obtained, duration is solved using the following equation:

$$t_d = D^{-0.2} (S_e / C_{te})^{0.4}$$

The duration equation is derived from key formulaic relationships (not detailed here). Its 0.4 exponent indicates that as a project's size increases, duration also increases, though less than proportionally. This size-duration relationship is also used in component-level scheduling algorithms with task overlaps computed to fall within total estimated project duration.
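
The effort and duration equations can be checked against each other numerically. The sketch below uses hypothetical inputs and function names. The two identities it computes follow algebraically from the equations as printed; the article does not say whether these are the relationships from which the duration equation was derived.

```python
import math


def effort(se, cte, d):
    """Basic effort K = D**0.4 * (Se / Cte)**1.2."""
    return d ** 0.4 * (se / cte) ** 1.2


def duration(se, cte, d):
    """Duration td = D**-0.2 * (Se / Cte)**0.4."""
    return d ** -0.2 * (se / cte) ** 0.4


# Hypothetical inputs.
se, cte, d = 20_000, 5_000, 15
k = effort(se, cte, d)
td = duration(se, cte, d)

# Consequences of the two equations taken together:
#   K * td**2 == (Se / Cte)**2   and   K / td**3 == D
size_identity_holds = math.isclose(k * td ** 2, (se / cte) ** 2)
complexity_identity_holds = math.isclose(k / td ** 3, d)

# Doubling size stretches duration by 2 ** 0.4, roughly 1.32.
duration_growth = duration(2 * se, cte, d) / td
```

Note the opposite signs on the exponents of D: a higher staffing complexity rating shortens estimated duration but raises estimated effort.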

Time/Schedule Tradeoffs

In software projects, a limited exchange can be made between required effort and schedule. In fact, SEER-SEM optimizes according to minimum time or optimal effort scenarios. The first implies that a software project will staff aggressively to finish in the minimum amount of time, while the alternative permits schedule slippage for the sake of effort savings. The trade between minimum time and optimal effort is shown in Figure 1.



Figure 1: Effort Schedule Tradeoff

Staffing Constraints

Oftentimes specific staffing levels need to be factored into an estimate. Other factors aside, lower staffing leads to higher productivity per programmer while increased staffing reduces productivity. The dynamic relation between staffing and productivity can be described by an optimal staffing curve as shown in Figure 2.



Figure 2: Optimal Staffing Over the Project Life Cycle

The curve depicts optimal staffing over time for an idealized project. Its shape varies depending on project size and complexity. Areas around the curve illustrate the impact on individual productivity when staffing at any time varies from optimal. When staffing is too high, there is a productivity penalty as increased coordination is required while more staff must spend time getting up to speed. When staffing is too low, productivity increases due to tighter coordination among fewer staff and from team members who on average are more expert. Adding more staff may increase a team's ability to get work done but every additional person added is slightly less effective than the last.

Detailed Allocations of Effort and Duration

Project planners often need to know how a project's overall estimated effort and duration are allocated into specific activities and labor categories. While allocations are partially determined by patterns seen in past projects, they will vary for each project according to its unique characteristics. For example, there may be more or less requirements activity, testing, etc. Table 1 provides a typical allocation, by percentage, of project effort into a matrix of labor types and activities.



Table 1: Allocation of Activities and Labor for a Sample Project in SEER-SEM
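
To illustrate how a percentage matrix of this kind is applied, the sketch below spreads a total effort estimate across activities and labor categories. The percentages, category names, and function name are all hypothetical placeholders and are not the values from Table 1.

```python
# Hypothetical allocation percentages (activity -> labor category -> percent).
# These are NOT the values from Table 1; they only need to sum to 100.
ALLOCATION_PCT = {
    "Requirements":         {"Management": 2, "Engineering": 10, "Test": 1},
    "Design":               {"Management": 3, "Engineering": 20, "Test": 2},
    "Code and unit test":   {"Management": 4, "Engineering": 30, "Test": 4},
    "Integration and test": {"Management": 3, "Engineering": 9,  "Test": 12},
}


def allocate_effort(total_effort, allocation_pct):
    """Split total effort into an activity-by-labor matrix using percentages."""
    total_pct = sum(sum(row.values()) for row in allocation_pct.values())
    if total_pct != 100:
        raise ValueError(f"allocation percentages sum to {total_pct}, not 100")
    return {
        activity: {labor: total_effort * pct / 100 for labor, pct in row.items()}
        for activity, row in allocation_pct.items()
    }


# Hypothetical total of 120 person-months.
plan = allocate_effort(120.0, ALLOCATION_PCT)
coding_engineering = plan["Code and unit test"]["Engineering"]
```

In practice the baseline percentages would shift with project characteristics, as the text describes, before being applied to the estimate.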

Calibrating SEER-SEM

Key components of the SEER-SEM model have been described, but we have not discussed how it adapts to accurately estimate particular development scenarios, and how the model is kept current as software development technologies and methodologies evolve. The answer is simple: masses of ongoing research and analysis.

The modeling team regularly combs through raw data and industry studies to determine the latest trends and their impact on project productivity. As part of this effort, Galorath maintains a software project repository of approximately 6,000 projects (and growing). About 3,500 projects containing effort and duration outcomes are stored in a unified repository that can be readily accessed for studies. These are from both defense and commercial sources representing many development organizations, permitting calibration of the model to a wide array of potential projects. Additional project outcomes, in the hundreds, are also available to the company, which has also collected sizing and other information on thousands of additional projects.

Analysis involves running project data through SEER-SEM using a special calibration mode. The model is essentially run backwards to find calibration factors. These factors are evaluated across different data attributes (e.g., platform and application) to detect trends. A variety of methods are used to mitigate outlier data points and control for variation. The variance in the data set is also used to establish default parameter ranges; nearly all settings accommodate risk. Model settings are updated as new trends are established.
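
One way to picture running the model backwards is to invert the basic effort equation for a completed project. The sketch below is only an illustration under stated assumptions: it treats effective technology (Cte) as the factor being solved for, uses the median as a simple stand-in for the unspecified outlier-mitigation methods, and the project data and function name are hypothetical. The article does not describe the actual calibration procedure at this level of detail.

```python
from statistics import median


def implied_cte(se, actual_effort, d):
    """Solve K = D**0.4 * (Se / Cte)**1.2 for Cte, given an observed effort K."""
    if se <= 0 or actual_effort <= 0 or d <= 0:
        raise ValueError("se, actual_effort and d must all be positive")
    return se / (actual_effort / d ** 0.4) ** (1 / 1.2)


# Hypothetical completed projects:
# (effective size, actual effort in model units, staffing complexity).
projects = [
    (20_000, 15.0, 15),
    (35_000, 32.0, 15),
    (12_000, 9.5, 12),
]

cte_values = [implied_cte(se, k, d) for se, k, d in projects]

# The median is less sensitive to outlying projects than the mean
# (an assumption of this sketch, not the article's stated method).
calibrated_cte = median(cte_values)
```

Grouping the implied values by platform or application type before summarizing them would correspond to the evaluation across data attributes described above.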

Galorath's work is also supplemented with findings from outside studies. For example, when examining relative language productivity, the company first uses its repository to empirically determine the impact of using different languages. However, because not all languages are well covered, it turns to outside sources that provide language descriptions, evolution trees, multidimensional comparisons, etc. Putting all this information together permits the company to make informed judgments about even rarely occurring languages.

Cost estimation models must be able to estimate a wide array of projects. This is accomplished with a significant number of modeling instruments, most of which can be independently set by the user:

  • Sizing Measures. Software's effective size varies according to many factors, and these factors change over time. As new languages are added to the developer's toolbox and old ones evolve, language mappings get updated. Sizing proxies also permit entirely new metrics to be added.
  • Knowledge Bases. New platforms (or operating environments) and applications are regularly being identified and added to SEER-SEM by way of its knowledge bases. Knowledge bases actually represent collections of parameter settings. Parameters in turn cover many different facets of the development process and of a software product's potential characteristics; new platforms and applications usually can be defined with a collection of parameter settings.
  • Allocations. According to project type, the balance shifts between types of activities and labor. Within SEER-SEM, detailed activity milestone and labor allocation tables are used to establish baseline allocations, which are then further adjusted depending on project-specific settings related to requirements, testing, and so forth.
  • Internal Calibrations. Several internal instruments, both linear and nonlinear, permit high-level, systematic adjustments to estimates.

Beyond the Model

While this article has dealt exclusively with the core SEER-SEM model, other aspects of the tool are critically important to its practical application. Among its key design philosophies is the use of qualitative rating scales, user-selectable knowledge bases for basic calibration, and a work breakdown structure that differentiates between the system, program, and component levels. The SEER-SEM model will itself soon be complemented with a data mining system that produces entirely dynamic, data-driven estimates.


About the Authors
Lee Fischman

Lee Fischman is Special Projects director at Galorath Incorporated, where he develops new concepts, produces new software applications, and conducts research projects. His research interests include software metrics theory and novel applications of estimating algorithms. He received a Bachelor of Arts in economics from the University of Chicago and a Master of Arts in economics from University of California Los Angeles.

Galorath Incorporated
100 N Sepulveda BLVD
STE 1801
El Segundo, CA 90245
Phone: (310) 414-3222
E-mail: info@galorath.com



Karen McRitchie

Karen McRitchie is vice president of Development at Galorath Incorporated, and is responsible for design, development, and validation of current and new System Evaluation and Estimation of Resources tools. For her longstanding contribution to commercial cost prediction tools, the International Society of Parametric Analysts honored her with its 2002 Parametrician of the Year award. McRitchie has a Bachelor of Arts in mathematics and system science from the University of California Los Angeles, and completed Master of Arts degree work at California State University, Northridge.

Galorath Incorporated
100 N Sepulveda BLVD
STE 1801
El Segundo, CA 90245
Phone: (310) 414-3222
E-mail: info@galorath.com



Daniel D. Galorath

Daniel D. Galorath founded and is president of Galorath Incorporated. He has solved a variety of management, costing, systems, and software problems, performing all aspects of software development and management. His company has developed tools, methods, and training for software cost, schedule, risk analysis, and management decision support, including the industry standard System Evaluation and Estimation of Resources - Software Estimating Model. Galorath has a Bachelor of Arts and a Master of Business Administration from California State University, Dominguez Hills.

Galorath Incorporated
100 N Sepulveda BLVD
STE 1801
El Segundo, CA 90245
Phone: (310) 414-3222
E-mail: info@galorath.com


