Metrics-Based Problem Detection in Object-Oriented Legacy Systems Using Audit-Reengineer

 

Anne-Marie Sassen
Software Engineering Division - Sema Group Sae, Spain

and

Radu Marinescu
"Politehnica" University of Timisoara, Romania
 
 

Abstract

Audit-Reengineer is a product based on Concerto2/Audit, SEMA's tool for quality assessment, and on the results of ESPRIT project 21975 FAMOOS. In this article we will describe the specific functionality Audit-Reengineer contains for reengineering, with a special emphasis on the results we obtained in using object-oriented metrics for problem detection in legacy systems.


1.  The Need for Re-engineering

The increasing reliance on information technology for consumer and industrial goods imposes new requirements on software flexibility. The major trends in customer requirements are customer specific modifications and software versions (custom made systems), much faster response to change requests and new requirements (evolution), and the ability to easily modify the software based on the immediate customer needs (tailoring).

Object oriented programming has often been promoted as the most effective approach to building inherently flexible software, and it has been quickly adopted by industry in recent years. There are already applications consisting of millions of lines of code, developed in several hundred man-years. While the benefits of object oriented technology are widely recognized, its use does not necessarily result in general, adaptable families of systems. These large applications often suffer from improper use of object oriented techniques (like inheritance) and from the fact that object oriented methods are geared towards building single applications rather than families of systems. The result is a new generation of inflexible legacy systems.

In order to better meet customer requirements, industrial users need to re-engineer these monolithic object oriented legacy systems into flexible frameworks and libraries of small, understandable software components. Such frameworks allow greater flexibility in meeting the varying needs of different categories of customers, as well as easier integration of new requirements.



2. The FAMOOS Approach to Re-engineering

Within the FAMOOS project an approach to re-engineering object oriented legacy systems into frameworks has been developed. This approach is described in the FAMOOS Handbook [Ciupke 99]. Because systems consisting of millions of lines of code are complex to manage, the handbook alone is not enough: an adequate tool is also needed to support the FAMOOS methodology. This tool is Audit-Reengineer.

Within FAMOOS the following reengineering life cycle model is defined:

    1. Requirements Analysis: Identifying the concrete reengineering goals
    2. Model Capture: Documenting and understanding the software system
    3. Problem Detection: Identifying flexibility and quality problems
    4. Problem Resolution: Selecting new software architectures to correct the problems
    5. Reorganization: Transforming the existing software architecture for a new release
    6. Change Propagation: Ensuring that all client systems benefit from the new release
So far, Audit-Reengineer supports phases 2 and 3, and the metrics we present in this paper are likewise intended as an aid for these two phases.

3. Functionality of Audit-Reengineer

Audit-Reengineer is based upon the tool Concerto2/Audit [Audit 98]. It provides:



4. Using the Metrics Included in Audit-Reengineer for Problem Detection

After a careful evaluation, within the FAMOOS project, of existing metrics for detecting flexibility problems [Mar 97], we selected the following metrics for Audit-Reengineer. We evaluated the hypotheses formed in earlier case studies against two new case studies. Case A was a C++ system developed with the Microsoft Visual C++ programming environment, in conjunction with the Microsoft Foundation Classes for Windows NT; it has 51 classes and 54,104 lines of code. Case B was a C++ system developed with the Borland C++ compiler for Windows 95 using the Object Windows Library; it has 81 classes and 16,581 lines of code.

4.1. Weighted Method Count (WMC) [Chid 94]. The WMC of a class is defined as the sum of the complexities of the methods of that class. How the complexity of a method is measured is a decision left to the implementation of the metric. We have followed the suggestion of Li and Henry [LiHe 93] by using McCabe's Cyclomatic Complexity Metric, defined as "the number of linearly independent paths and therefore, the minimum number of paths that should be tested" [McCa 76]. Another possibility is to assign a unitary complexity to each method. In that case the WMC of a class is equal to the number of methods of that class.
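To make the two variants of the definition concrete, here is a minimal Python sketch. It assumes the cyclomatic complexity of each method has already been measured by some other means; the function name `wmc`, the dictionary layout and the example values are hypothetical and are not part of Audit-Reengineer.

```python
from typing import Dict


def wmc(method_complexity: Dict[str, int], unitary: bool = False) -> int:
    """Weighted Method Count of one class.

    method_complexity maps each method name to its (already measured)
    cyclomatic complexity. With unitary=True every method counts as 1,
    so the result is simply the number of methods.
    """
    if unitary:
        return len(method_complexity)
    return sum(method_complexity.values())


# Hypothetical class with three methods and made-up complexity values.
example_class = {"open": 1, "dispatch": 12, "close": 2}

wmc_mccabe = wmc(example_class)                 # sum of the complexities
wmc_unitary = wmc(example_class, unitary=True)  # number of methods
```

With these made-up values the McCabe-based WMC is 15, while the unitary WMC is 3.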

We developed the following hypotheses about this metric:
Hypothesis 1.  The outliers are the central (major) classes in the project, being the main control classes.
Hypothesis 2.  The outliers are more error prone and harder to maintain.
Hypothesis 3.  Outliers have few or no children; classes with high WMC values having many descendants are critical from the maintenance point of view and their redesign should be considered.
Hypothesis 4.  Classes with large numbers of methods are likely to be more application specific, limiting the possibility of reuse.
 

Site Name   Minimum   Average   Maximum
Case A            1        41       260
Case B            1        26       179
Table 4.1. Summary Statistics for the WMC metric

Hypotheses Validation

  • Hypothesis 1. A general observation is that for both sites over 85% of the classes are simple. There are few outliers, but these have a very high WMC. In both cases, the outliers proved to be the central classes in the project, i.e. classes that manage the whole application. This confirms our first hypothesis. This information can be used at the beginning of a reengineering process for unfamiliar software, because it is always much easier to start the study by analyzing the main classes.

  • Hypothesis 2. We cannot conclude from the two cases studied that the outliers are more error prone and harder to maintain. This might be because the two case studies are not legacy systems. Both systems are still being maintained without any specific difficulty, not least because those who maintain them are mainly those who designed and implemented them.

  • Hypotheses 3-4. In Case A the outliers have very few children (mostly 0 or 1 child classes), thus reducing the impact of changes on the derived classes. In Case B one of the outliers had 7 children; it is in fact the class with the highest number of children. The designer of this system, however, does not think that the design of this class should be reconsidered. This class provides its children with services for storing themselves in, and reading themselves from, a database. In fact, this is also a class that might be very easy to reuse in other projects (when they need to connect to the same database system). The other outlier classes are indeed more application specific (8 out of 10), which confirms the 4th hypothesis. Of course, this hypothesis is not an "iron rule"; there might be classes that remain very reusable in spite of a high WMC value, but this does not alter the general rule.


     

    Average Method-Complexity of a Class
    We also analyzed the classes that concentrate much complexity in a few methods. These are classes with a high ratio of WMC based on McCabe's cyclomatic complexity to WMC based on unitary complexity of each method [Mar 97]. We expected that these methods could be split, thereby distributing the complexity over more methods. This also has the advantage of increasing the potential reusability of the class. Speaking to the developers of Case A, we found out that the few methods of the outlier classes contained huge selector structures ("switch-case" in C/C++). The designers admitted that although these huge methods affect neither the maintainability nor the understandability of the class, it would be wise to split them into more methods. This encourages us to validate the observation on other projects in the future. In Case B this approach also revealed classes with a small number of very large methods. In this case the classes could not be split, but the complexity could have been distributed over more methods.
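The ratio described above is simply the mean cyclomatic complexity per method. The following Python sketch illustrates it under the assumption that per-method complexities are already available; the function name and the example classes are hypothetical.

```python
from typing import Dict


def average_method_complexity(method_complexity: Dict[str, int]) -> float:
    """Ratio of McCabe-based WMC to unitary WMC for one class.

    Returns 0.0 for a class without methods (an assumption made here
    to avoid a division by zero).
    """
    number_of_methods = len(method_complexity)
    if number_of_methods == 0:
        return 0.0
    return sum(method_complexity.values()) / number_of_methods


# Hypothetical classes: one dominated by a huge "switch-case" method,
# one with the complexity spread over several small methods.
selector_heavy = {"handle_message": 40, "reset": 1}
well_spread = {"parse": 2, "check": 3, "store": 1, "report": 2}

ratio_heavy = average_method_complexity(selector_heavy)
ratio_spread = average_method_complexity(well_spread)
```

Here the first class has a ratio of 20.5 and the second a ratio of 2.0, so only the first would be flagged as gathering much complexity in a few methods.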
     
     

    4.2 Data Abstraction Coupling (DAC) [LiHe 93]. A class can be viewed as an implementation of an abstract data type (ADT). A variable declared within a class X may have as its type an ADT that is itself another class. This causes a particular type of coupling between X and the other class, since X can access the properties of the ADT class. The DAC of a class is defined as the number of ADTs defined in a class, and hence it measures coupling complexity.
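As an illustration of the definition, the sketch below counts the attributes of a class whose declared type is another class of the system. This is one possible reading: counting distinct ADT types instead of attribute declarations would be an equally defensible variant. The function name, the data layout and the example classes are hypothetical.

```python
from typing import Dict, Set


def dac(attribute_types: Dict[str, str], system_classes: Set[str]) -> int:
    """Data Abstraction Coupling of one class.

    attribute_types maps each attribute name to the name of its declared
    type; system_classes is the set of class names defined in the system.
    Assumption: every attribute of a class type counts once, even if two
    attributes share the same type.
    """
    return sum(1 for type_name in attribute_types.values()
               if type_name in system_classes)


# Hypothetical system and class.
system_classes = {"Window", "Document", "Printer"}
editor_attributes = {
    "main_window": "Window",
    "document": "Document",
    "backup": "Document",
    "page_count": "int",  # built-in type, not counted
}

editor_dac = dac(editor_attributes, system_classes)
```

For this made-up class the DAC is 3, because three of the four attributes have a class of the system as their type.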

    We developed the following hypotheses about this metric.
    Hypothesis 1.  Outlier classes are mainly the central control classes of the system. DAC outliers that are not central classes are undesired and their redesign should be considered.
    Hypothesis 2.  The outliers are harder to maintain, as they will often have to change because of changes in the classes they depend on.
     
    Site Name   Minimum   Average   Maximum
    Case A            0      9.24        37
    Case B            0      4.03        26

    Table 4.2.  Summary statistics for the DAC metric

    Hypotheses Validation

  • Hypothesis 1. For both Case A and Case B we observed that almost all outliers of the DAC metric (90% in Case A and 80% in Case B) are the same classes as the outliers of the WMC metric. This confirms our hypothesis and is to be expected, because classes that manage the whole application are coupled to a lot of other classes. In Case B we detected one exception: a class that is not very important in the overall design and that might be a candidate for being made less dependent on other classes. For the four other outliers this is not possible, since these are the central, main control classes. From the observations above we learn that, in order to detect classes whose coupling is too high, we should analyze the classes with high DAC values that are not among the outliers for WMC. These might be candidates for improvement.

  • Hypothesis 2. Again, we did not observe any problems related to maintenance, but this might be due to the fact that these were not legacy systems.
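The detection rule drawn from the validation of Hypothesis 1 (high DAC values, but not among the WMC outliers) amounts to a set difference. The Python sketch below illustrates it. How outliers are selected is an assumption made here (the n highest values); the function names, class names and metric values are hypothetical.

```python
from typing import Dict, Set


def top_outliers(values: Dict[str, float], n: int) -> Set[str]:
    """Names of the n classes with the highest metric value.

    Assumption: 'outlier' simply means 'among the n highest values'.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n])


def coupling_candidates(wmc: Dict[str, float],
                        dac: Dict[str, float], n: int) -> Set[str]:
    """DAC outliers that are not also WMC outliers."""
    return top_outliers(dac, n) - top_outliers(wmc, n)


# Made-up metric values for a hypothetical five-class system.
wmc_values = {"App": 260, "MainFrame": 180, "Parser": 90, "Logger": 12, "Config": 8}
dac_values = {"App": 37, "MainFrame": 30, "Logger": 25, "Parser": 5, "Config": 2}

candidates = coupling_candidates(wmc_values, dac_values, n=3)
```

In this made-up system only `Logger` is returned: it is strongly coupled without being one of the central control classes.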


     
     
     
     

    4.3 Change Dependency Between Classes (CDBC) [Hitz 96]. This metric determines the potential amount of follow-up work to be done in a client class (CC) when the server class (SC) is being modified, by counting the number of methods in the CC that might need to be changed because of a change in SC.
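The one-sentence description above can be sketched in Python as follows. This is a deliberate simplification that only counts the client methods referring to the server class; the full definition in [Hitz 96] is more refined and is not reproduced here. The function name, the data layout and the example classes are hypothetical.

```python
from typing import Dict, Set


def cdbc(client_methods: Dict[str, Set[str]], server: str) -> int:
    """Simplified change dependency of a client class on a server class.

    client_methods maps each method of the client class to the set of
    class names that the method refers to. The result is the number of
    client methods that might need follow-up work when `server` changes.
    """
    return sum(1 for used in client_methods.values() if server in used)


# Hypothetical classes; each method lists the classes it refers to.
classes = {
    "Order": {
        "total": {"Invoice"},
        "ship": {"Invoice", "Carrier"},
        "identifier": set(),
    },
    "Invoice": {
        "render": {"Order"},
    },
}

order_on_invoice = cdbc(classes["Order"], "Invoice")
invoice_on_order = cdbc(classes["Invoice"], "Order")
```

Because the metric is defined on an ordered pair, the two directions differ: here `Order` depends on `Invoice` through 2 methods, while `Invoice` depends on `Order` through 1 method.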

    The metric is defined not on a single class, but on a client class / server class pair. This allows us to define hypotheses from different perspectives:
    Hypothesis 1. Client perspective:
                - CCs with high CDBC values (and many couplings) are in the "heart" of the design (similar to WMC)
                - CC outliers are very hard to maintain as they strongly depend on many other classes
                - CC outliers are more error prone and harder to understand.
    Hypothesis 2. Server perspective: classes that are mostly used by other classes should be stable. If not, they should be consolidated.
    Hypothesis 3. Pairs of classes with mutual strong dependency (high CDBC) are not desirable.

    Hypotheses Validation

  • Hypothesis 1. Looking at the client classes with the highest CDBC, in both Case A and Case B we noticed that although this metric does find important classes, the WMC metric performs better in this respect. Among the CDBC outliers we find some couplings that are in fact implementation matters and not so important for the overall design. Thus we conclude: to find the most important classes of the project, the real control classes, it is best to use WMC. To find the classes that depend most on each other at the implementation level, use CDBC.

  • Hypothesis 2. Observing the results from the server perspective, we found that in Case A the classes that offer services to most other classes are without exception placed in the high levels of the class hierarchy and have a small number of children. We suppose that most of these classes are in fact classes defining an interface that will be used polymorphically at run-time. Because of polymorphism, objects are declared as instances of the abstract class defining the interface. In Case B, however, this metric revealed real service classes offering services to other classes. These were mostly leaf classes.

  • Hypothesis 3. In Case A no relevant conclusions could be drawn concerning strong bi-directional coupling. In Case B we found two pairs of classes that are strongly dependent on each other. The designer of this system agreed that these couplings could be removed to make the classes more reusable. Future research on this metric should continue analyzing the characteristics of class pairs where the coupling is strong in both directions.


     
     
     
     

    4.4 Tight Class Cohesion (TCC) [Biem 95] is defined as the relative number of directly connected methods, i.e. the number of directly connected pairs of methods divided by the total number of method pairs of the class. Two methods are directly connected if they use at least one common data member of the class. In this definition a data member is considered to be "of the class" if and only if it is declared in that class, meaning that inherited data members are not counted for this metric. This can be justified by the fact that this metric is a measure of cohesion, while the use of inherited data members is a matter of coupling. The outliers for the TCC metric are the classes with the lowest TCC values.
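A minimal Python sketch of this definition follows. It assumes that, for each method, the set of data members declared in the class and used by that method is already known; the function name and the example class are hypothetical. Returning `None` for classes with fewer than two methods is an assumption made here, since no method pair exists in that case.

```python
from itertools import combinations
from typing import Dict, Optional, Set


def tcc(method_members: Dict[str, Set[str]]) -> Optional[float]:
    """Tight Class Cohesion of one class.

    method_members maps each method to the set of data members declared
    in the class itself (inherited members excluded) that the method uses.
    """
    pairs = list(combinations(method_members.values(), 2))
    if not pairs:
        return None  # fewer than two methods: no pair to evaluate
    connected = sum(1 for first, second in pairs if first & second)
    return connected / len(pairs)


# Hypothetical class with four methods.
account = {
    "deposit": {"balance"},
    "withdraw": {"balance", "limit"},
    "set_limit": {"limit"},
    "owner_name": {"owner"},
}

account_tcc = tcc(account)
```

Of the six method pairs of this made-up class, two share a data member (`deposit`/`withdraw` and `withdraw`/`set_limit`), so the TCC is 2/6.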

    We have formulated the following hypotheses about this metric:
    Hypothesis 1. Classes with low TCC are not cohesive and might be split in two or more classes.
    Hypothesis 2. Classes with low TCC incorporate more than one functionality.
     
    Site Name   Minimum   Average   Maximum
    Case A         0.02      0.28      0.67
    Case B         0.07      0.45         1

    Table 4.3.  Summary statistics for the TCC metric

    Hypotheses Validation

  • Hypotheses 1 and 2. In both case studies the set of outliers was significantly altered by two types of "false positive" classes, i.e. classes with empty constructors and/or destructors; and data based classes provided with get/set methods, where all the other methods access the data through these methods. By filtering the "false positives" we detected in Case A those classes that indeed implemented more than one abstraction, and that could be split. In Case B, of the 5 classes with the lowest cohesion, only one implements more than one functionality and could be split. Three others are very small classes which do implement only one function, with few attributes. And the fifth one is the big class with many get/set functions.


     

    Conclusive Observation
    Comparing the TCC results of Case A with those of Case B and of the case studies analyzed in the past, both the average and the maximum values of Case A are very low. Our assumption is that the reason for this is not a poor structural design, but the strong distorting effect that the large number of "false positive" classes has on the statistics. This leads us to the conclusion that the results of this metric can only be interpreted reliably after the "false positives" have been filtered out.
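The effect of filtering can be illustrated with a small Python sketch. It handles only the first kind of "false positive" (empty constructors and destructors), approximated here, as an assumption, by dropping every method that uses no data member of the class; classes whose methods go through get/set accessors would need a call-graph analysis that is not shown. All names and values are hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional, Set


def tcc(method_members: Dict[str, Set[str]]) -> Optional[float]:
    """Share of method pairs that use at least one common data member."""
    pairs = list(combinations(method_members.values(), 2))
    if not pairs:
        return None
    return sum(1 for first, second in pairs if first & second) / len(pairs)


def drop_memberless_methods(
        method_members: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Remove methods that use no data member (e.g. empty constructors)."""
    return {name: used for name, used in method_members.items() if used}


# Hypothetical class with an empty constructor and destructor.
storage = {
    "constructor": set(),
    "destructor": set(),
    "load": {"path", "data"},
    "save": {"path", "data"},
    "size": {"data"},
}

raw_tcc = tcc(storage)
filtered_tcc = tcc(drop_memberless_methods(storage))
```

For this made-up class the unfiltered TCC is 0.3 (3 connected pairs out of 10), whereas after filtering it is 1.0 (3 out of 3): the two empty methods alone made a cohesive class look like an outlier.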
     
     

    4.5 Reuse of Ancestors (RA) [Mar 98]. The RA metric measures the real code reuse of an ancestor class A within a derived class C. RA is calculated as the sum of the reuses of class A in all methods of class C, divided by the total number of methods defined in class C. In [Mar 98] two different ways to calculate the reuse at the method level were proposed:
                i.  the reuse of class A in method m is 1 if m uses at least one member of A, and 0 otherwise. (RA-Unitary)
                ii. the reuse of class A in method m is the relative number of members from A that are used in method m. (RA-Percentage)
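Both variants can be sketched in a few lines of Python. The sketch takes the unitary reuse to be 0 when a method uses no member of A, and it assumes that "relative number" in RA-Percentage means relative to the total number of members of A; this reading, the function names and the example classes are assumptions and are not taken from [Mar 98].

```python
from typing import Dict, Set


def ra_unitary(ancestor_members: Set[str],
               method_uses: Dict[str, Set[str]]) -> float:
    """Share of methods of C that use at least one member of ancestor A."""
    if not method_uses:
        return 0.0
    reusing = sum(1 for used in method_uses.values()
                  if used & ancestor_members)
    return reusing / len(method_uses)


def ra_percentage(ancestor_members: Set[str],
                  method_uses: Dict[str, Set[str]]) -> float:
    """Mean, over the methods of C, of the fraction of A's members used."""
    if not method_uses or not ancestor_members:
        return 0.0
    total = sum(len(used & ancestor_members) / len(ancestor_members)
                for used in method_uses.values())
    return total / len(method_uses)


# Hypothetical ancestor with four members and derived class with four methods.
shape_members = {"draw", "bounds", "color", "name"}
button_methods = {
    "paint": {"draw", "color"},
    "resize": {"bounds"},
    "title": {"label"},  # uses only a member of the derived class
    "reset": set(),
}

unitary = ra_unitary(shape_members, button_methods)
percentage = ra_percentage(shape_members, button_methods)
```

With these made-up classes RA-Unitary is 0.5 (two of the four methods reuse the ancestor) and RA-Percentage is 0.1875 ((2/4 + 1/4 + 0 + 0) / 4).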

    We have defined the following hypotheses about RA:
    Hypothesis 1.  Outliers are classes that reuse a lot of code from ancestor classes; since reuse is a form of coupling, the maintenance effort for these classes will be high.
    Hypothesis 2.  Very high values are a sign of misuse of subclassing.
    Hypothesis 3.  A low average RA value is a sign of a poor OO design.
     
    Site Name   Minimum   Average   Maximum
    Case A         0.02      0.43      0.60
    Case B         0.10      0.45      0.90

    Table 4.4.  Summary statistics for the RA-Unitary metric

    Observations
    In Case A, taking a closer look at the ancestor class in the class pairs that have an RA value (RA-Unitary) higher than 0.3, we observe that only a small number of ancestors are reused in the derived classes. Speaking to the designers, we found out that each such ancestor, together with the set of derived classes that use it, forms a semantically related class cluster. This could be important information for the model capture phase of the re-engineering process.

    A second observation was made when calculating for a class the reuse of all its ancestors (based on RA-Percentage). The majority of the outliers were "light classes", i.e. classes with few methods. This shows that most of the use of ancestor members takes place in a small number of methods. This is normal, as classes that were derived in order to reuse the code of their ancestors only refine the ancestor class. Classes that do not conform to this rule deserve closer inspection.

    Hypotheses Validation
    In Case B we could not confirm our hypotheses. The outliers do indeed reuse a lot of code from their parents. All outliers were a specific kind of form that could be printed. This was in fact not a case of misuse of subclassing, but simply a use of the advantages of object oriented programming. The outliers were all leaf classes.



    5. Conclusions

    In this article we have described Audit-Reengineer, a tool for the reengineering of large object oriented legacy systems, and we have tried to assess the suitability of the metrics included in the tool from the perspective of re-engineering. We validated the tool on two medium sized case studies. The general conclusion is that the metrics included in the tool work very well for model capture. For problem detection, however, we found less evidence of their suitability. A likely reason for this is that the case studies were well designed and that they are not legacy systems. In fact, they have been maintained until now without any specific difficulty. Therefore, our next step in the evaluation of Audit-Reengineer will be to apply it to real legacy systems.



    References
    [Audit 98] User Manual Concerto2/Audit-CC++, Sema Group, 56 rue Roger Salengro, 94126 Fontenay-Sous-Bois Cedex, France, 1998.
    [Biem 95] J.M. Bieman, B.K. Kang. Cohesion and Reuse in an Object-Oriented System. Proc. ACM Symposium on Software Reusability, April 1995.
    [Ciupke 99] O. Ciupke, S. Demeyer, S. Ducasse, R. Nebbe, T. Richner, M. Rieger, B. Schulz, S. Tichelaar, J. Weisbrod. The FAMOOS Handbook of Reengineering, Deliverable 5.6.1 of FAMOOS, 1999.
    [CDIF 94] "CDIF Framework for Modeling and Extensibility", Electronic Industries Association, EIA/IS-107, January 1994, available online at http://www.eigroup.org.
    [Chid 94] S.R. Chidamber, C.F. Kemerer. A Metrics Suite for Object Oriented Design. IEEE Transactions on Software Engineering, Vol. 20, No. 6, June 1994.
    [Deme 98] Serge Demeyer, Sander Tichelaar and Patrick Steyaert. Definition of a Common Exchange Model, version 1.1, Deliverable A2.4.1 of FAMOOS, 1998. Available on the WWW at: http://www.iam.unibe.ch/~famoos/InfoExchFormat/
    [Hitz 96] M. Hitz, B. Montazeri. Measuring Coupling in Object-Oriented Systems. Object Currents, Vol. 1, No. 4, 1996.
    [LiHe 93] W. Li, S. Henry. Maintenance Metrics for the Object Oriented Paradigm. IEEE Proc. First International Software Metrics Symp., pp. 52-60, May 21-22, 1993.
    [Mar 97] Radu Marinescu. The Use of Software Metrics in the Design of Object Oriented Systems. Diploma thesis at the "Politehnica" University of Timisoara, 1997.
    [Mar 98] Radu Marinescu. Refined Metrics and Heuristics. Internal FAMOOS achievement, RMETH A2.5.1, 1998.
    [McCa 76] T.J. McCabe. A Complexity Measure. IEEE Trans. Software Eng., Vol. 2, No. 4, December 1976.
     
     








    If you have comments or suggestions, email me at anne-marie.sassen@sema.es