Section 1. Introduction

Columbus is a reverse engineering framework developed in cooperation between the Research Group on Artificial Intelligence in Szeged and the Software Technology Laboratory of Nokia Research Center. Columbus is able to analyze large C/C++ projects and to extract their UML class model as well as conventional call graphs. The main motivation for developing the Columbus system was to create a general framework for combining a number of reverse engineering tasks and to provide a common interface for them. Thus, Columbus is a framework tool that supports project handling, data extraction, data representation, data storage, filtering and visualization.

Team members:
Rudolf Ferenc, University of Szeged, ferenc@cc.u-szeged.hu
Arpad Beszedes, University of Szeged, beszedes@cc.u-szeged.hu
Tibor Gyimothy, University of Szeged, gyimi@cc.u-szeged.hu
Section 2. Experience Report

Columbus can analyze preprocessed C++ source code (for non-preprocessed files it invokes an external preprocessor), so the first step was to obtain Borland C++ Builder 5. After successfully preprocessing Sortie, we began the analysis. The first attempt did not bring the expected results, because Sortie's GUI uses the VCL (Visual Component Library), which relies heavily on Borland's C++ extensions (e.g. keywords like '__property' and '__published'), whereas Columbus handled "only" ANSI C++ and Microsoft's extensions. We therefore had to extend Columbus to handle Borland's dialect.

In the meantime, we put considerable effort into improving our C++ schema. We learned a lot from our joint paper with Susan Elliott Sim, Richard C. Holt and Rainer Koschke: "Towards a Standard Schema for C/C++" (to be presented at WCRE 2001). We also had productive discussions with Jürgen Ebert, Andreas Winter and Volker Riediger from the University of Koblenz-Landau. The resulting C++ schema seems to be a lot more useful than the old one (it will be documented and published shortly). We also extended Columbus so that it can now export its AST into GXL according to our C++ schema.

As soon as the C++ analyzer was ready, we parsed the Sortie source code and exported the results into GXL. The result is a 66 MB file (2546 classes, 15235 functions and 14054 attributes) that contains all information, including STL and VCL. (The file was validated against the GXL DTD.) Because it is very hard to deal with this amount of data, we filtered the AST to include only the classes from the Sortie source code. The result is a 3 MB file (69 classes), which is unfortunately not valid, because it contains references to types that come from the standard headers, which have been filtered out. Both files are attached (sortie-gxl-in-columbus-schema-full.zip and sortie-gxl-in-columbus-schema-filtered.zip).
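To illustrate why filtering can leave a GXL file invalid, here is a minimal Python sketch. It is not part of Columbus: the tiny graph, the node ids and the "path" attribute used for filtering are all hypothetical, and the real Columbus schema may represent type references differently. The sketch removes every node that does not belong to the project and then lists the edges that still point at a removed node.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written GXL fragment. The ids and the "path" attribute are
# hypothetical; they are not taken from the real Columbus export.
GXL = """<gxl>
  <graph id="demo">
    <node id="c1"><attr name="path"><string>sortie/Tree.h</string></attr></node>
    <node id="c2"><attr name="path"><string>sortie/Plot.h</string></attr></node>
    <node id="t1"><attr name="path"><string>include/string</string></attr></node>
    <edge from="c1" to="c2"/>
    <edge from="c2" to="t1"/>
  </graph>
</gxl>"""


def node_path(node):
    """Return the text of the node's (hypothetical) 'path' attribute, or ''."""
    for attr in node.findall("attr"):
        if attr.get("name") == "path":
            return attr.findtext("string", default="")
    return ""


def filter_graph(graph, prefix):
    """Remove nodes whose path does not start with prefix; edges stay untouched."""
    for node in list(graph.findall("node")):
        if not node_path(node).startswith(prefix):
            graph.remove(node)


def dangling_edges(graph):
    """List (from, to) pairs of edges that refer to a node id no longer present."""
    ids = {n.get("id") for n in graph.findall("node")}
    return [(e.get("from"), e.get("to")) for e in graph.findall("edge")
            if e.get("from") not in ids or e.get("to") not in ids]


graph = ET.fromstring(GXL).find("graph")
filter_graph(graph, "sortie/")
print(dangling_edges(graph))  # [('c2', 't1')]
```

The remaining edge from c2 to t1 refers to an id that no longer exists in the file. Since the GXL DTD declares the endpoints of an edge as ID references, a validating parser rejects such a document. A filter that wanted to keep the output valid would have to either drop these edges or keep stub nodes for the referenced types.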
Please note that Columbus does not yet deal with function bodies (statements and expressions), although this is ongoing work, and that the GXL export is not complete: it does not export template parameter lists (all type references to template parameters point to a dummy typedef).

Section 3. Collaboration Partners

Because we joined the collaborative demonstration a little late, we did not have collaboration partners. We place our results at everybody's disposal for further analyses.

Section 4. Solution to tasks

The task of Columbus is to parse the C++ source code and to produce input data for other tools (possibly filtering the data). Therefore, solutions for reengineering the Sortie system consist of combining Columbus with these tools (e.g. visualisers and remodularisers).

Participation at WCRE 2001

We will demonstrate Columbus at the Tools Fair at WCRE 2001.