C# program verification problems: solution by a three-level method

The evolution of formal methods allowed us to overcome many obstacles in verification of procedural programs. However, wide spreading of object-oriented languages has brought new challenges, even in the case of sequential programs. These problems were thoroughly exam-ined by ESC/Java and Spec#, thou...

Full description

Saved in:

Bibliographic Details
Date:	2008
Main Author:	Promsky, A.V.
Format:	Article
Language:	English
Published:	Інститут програмних систем НАН України 2008
Subjects:	Формальні методи програмування
Online Access:	https://nasplib.isofts.kiev.ua/handle/123456789/1479
Tags:	Add Tag No Tags, Be the first to tag this record!
Journal Title:	Digital Library of Periodicals of National Academy of Sciences of Ukraine
Cite this:	C# program verification problems: solution by a three-level method / A.V. Promsky // Пробл. програмув. — 2008. — N 2-3. — С. 313-322. — Бібліогр.: 19 назв. — англ.

Institution

Digital Library of Periodicals of National Academy of Sciences of Ukraine

id	nasplib_isofts_kiev_ua-123456789-1479
record_format	dspace
spelling	nasplib_isofts_kiev_ua-123456789-14792025-06-03T16:04:51Z C# program verification problems: solution by a three-level method Promsky, A.V. Формальні методи програмування The evolution of formal methods allowed us to overcome many obstacles in verification of procedural programs. However, wide spreading of object-oriented languages has brought new challenges, even in the case of sequential programs. These problems were thoroughly exam-ined by ESC/Java and Spec#, though in many cases they just state the presence of the challenge. This paper presents an overview of some problematic issues and a three-level approach to their solution in the C#-light project. Развитие формальных методов позволило решить многие вопросы верификации процедурных программ. Однако, широкое распространение объектно-ориентированных языков выявило новые проблемы даже для последовательных программ. Эти проблемы были детально исследованы в проектах ESC/Java и Spec#, но решение предлагалось в редких случаях. В этой статье рассмотрены некоторые из проблем и их решение с помощью трехуровневого подхода в проекте C#-light. 2008 Article C# program verification problems: solution by a three-level method / A.V. Promsky // Пробл. програмув. — 2008. — N 2-3. — С. 313-322. — Бібліогр.: 19 назв. — англ. 1727-4907 https://nasplib.isofts.kiev.ua/handle/123456789/1479 519.681.3 en application/pdf Інститут програмних систем НАН України
institution	Digital Library of Periodicals of National Academy of Sciences of Ukraine
collection	DSpace DC
language	English
topic	Формальні методи програмування Формальні методи програмування
spellingShingle	Формальні методи програмування Формальні методи програмування Promsky, A.V. C# program verification problems: solution by a three-level method
description	The evolution of formal methods allowed us to overcome many obstacles in verification of procedural programs. However, wide spreading of object-oriented languages has brought new challenges, even in the case of sequential programs. These problems were thoroughly exam-ined by ESC/Java and Spec#, though in many cases they just state the presence of the challenge. This paper presents an overview of some problematic issues and a three-level approach to their solution in the C#-light project.
format	Article
author	Promsky, A.V.
author_facet	Promsky, A.V.
author_sort	Promsky, A.V.
title	C# program verification problems: solution by a three-level method
title_short	C# program verification problems: solution by a three-level method
title_full	C# program verification problems: solution by a three-level method
title_fullStr	C# program verification problems: solution by a three-level method
title_full_unstemmed	C# program verification problems: solution by a three-level method
title_sort	c# program verification problems: solution by a three-level method
publisher	Інститут програмних систем НАН України
publishDate	2008
topic_facet	Формальні методи програмування
url	https://nasplib.isofts.kiev.ua/handle/123456789/1479
citation_txt	C# program verification problems: solution by a three-level method / A.V. Promsky // Пробл. програмув. — 2008. — N 2-3. — С. 313-322. — Бібліогр.: 19 назв. — англ.
work_keys_str_mv	AT promskyav cprogramverificationproblemssolutionbyathreelevelmethod
first_indexed	2025-11-28T03:44:54Z
last_indexed	2025-11-28T03:44:54Z
_version_	1850004213702066176
fulltext	Формальні методи програмування © A.V. Promsky, 2008 ISSN 1727-4907. Проблеми програмування. 2008. № 2-3. Спеціальний випуск 313 UDC 519.681.3 C# PROGRAM VERIFICATION PROBLEMS: SOLUTION BY A THREE-LEVEL METHOD A.V. Promsky Siberian Division of the Russian Academy of Sciences A. P. Ershov Institute of Informatics Systems, ac. Lavrentiev ave., 6, 630090 Novosibirsk, Russia. Fax: 332 3494; phone: 330 8652. E-mail: promsky@iis.nsk.su The evolution of formal methods allowed us to overcome many obstacles in verification of procedural programs. However, wide spreading of object-oriented languages has brought new challenges, even in the case of sequential programs. These problems were thoroughly exam- ined by ESC/Java and Spec#, though in many cases they just state the presence of the challenge. This paper presents an overview of some problematic issues and a three-level approach to their solution in the C#-light project. Развитие формальных методов позволило решить многие вопросы верификации процедурных программ. Однако, широкое распространение объектно-ориентированных языков выявило новые проблемы даже для последовательных программ. Эти проблемы были детально исследованы в проектах ESC/Java и Spec#, но решение предлагалось в редких случаях. В этой статье рассмотрены некоторые из проблем и их решение с помощью трехуровневого подхода в проекте C#-light. Introduction The fully automatic verification of programs is a tempting and hardly accessible goal. The active use of object- oriented languages has raised the difficulty to a new level. New formal methods and specification languages are re- quired, because the classical Hoare approach and first-level logics are no more adequate for the task. This paper purposes two aims. First, it gives an overview of some well-known problems of OO program verifica- tion [1, 2] and, second, it presents a three-level approach to C#-light program verification in which these problems were successfully solved. The three-level approach is an attempt to reconcile two controversial goals. First of all, a wide coverage of C# is a requirement of practical (and, of course, commercial) interest. That is why C#-light includes a great part of C# [3], except for threads and some realization-dependent constructs. The formal definition of C#-light has the form of a struc- tured operational semantics. Secondly, regardless of its limitations, the axiomatic approach [4, 5] is still the best choice. As a result, we chose a compact core of C#-light — the C#-kernel language and developed its Hoare-like logic. The first level of our approach consists in translation from C#-light into C#-kernel. The formally defined set of translation rules admits the proof of equivalence. At the second stage the lazy verification conditions are derived in C#-kernel axiomatic semantics. Laziness results from the fact that these conditions are not the final assertions of the specification language. Indeed, they can contain special terms related to the issues of control transfer and dynamic binding. Refine- ment of these terms is beyond the capacity of the Hoare logic and is performed at the third level. In order to test our method, we verified most programs from the well-known collection [1] illustrating the verifi- cation challenges. Originally written in Java, they suit the C# syntax with minimal changes. We also examined some innovations of C#, such as delegates and events. We also developed some strategies of verification condition simplification. In the case of OO languages, the size of verification conditions can overgrow all reasonable limits, so it becomes an additional challenge. This research has been partially supported by grant 04-01-00114a from RFBR and by grant N 14 for young sci- entists from SB RAS. 1. Preliminaries Let us briefly describe the C#-light project and present information required for further reading. The details can be found in [6 – 9]. 1.1. The C#-light language. As it was said, C#-light is a sequential language. Besides the threads, the following constructs are forbidden: attributes; destructors; the using statement; the checked and unchecked operators; unsafe code; pre-processing directives. Note that constructs listed above either require the low-level knowledge about a concrete implementation of .Net platform, or their contribution to C# expressivity is not significant. However, the C#-light language is still a representa- tive subset of C#. In comparison with Java, it includes properties, events, delegates and indexers. Формальні методи програмування 314 1.2. The assertion language. The choice of an assertion language, which is used for pre-/postconditions and in- variants, plays a crucial role. Indeed, many properties of the Hoare logic depend on the expressiveness of assertions. The classical first-order language is not sufficient for modern programming languages and is a target for replace- ment/extension [1, 10]. In our case, we added functors and some elements of λ-calculus. Types of the assertion language include a universal set U, the set of C# identifiers N, the set of abstract type names T, functions TT ′→ and Cartesian products TT ′× , where T, T′ are types. Let us note that U includes, at least, all C# literals, the set of memory sells L, natural numbers Nat and undefined value ω. Expressions include constants, variables, terms ),...,( 1 nssf and λ-terms ),( sxλ with a standard interpretation. Assertions are boolean expressions built by standard rules. We also fix the function ),,( vcfupd which updates the function f at the argument c with the value v. For convenience we will use the usual infix notation in our examples. 1.3. The C#-kernel language. Strictly speaking, C#-kernel is not a syntactic subset of C# as long as it contains a new class of constructions — meta-instructions. However, semantically we are still in the scope of C#-light. Let us briefly list the main features of C#-kernel: • fully qualified names are used instead of namespaces and using-directives; • the set of statements includes an expression-statement, an if statement, a goto statement and a block; • the operators \|\|, &&, ?:, new and all assignments are forbidden; • the sets of labels, local variable names, local constant names and type names are disjoint and do not contain duplicates within a program; • a method and delegate invocation has a normalized form x.y(z1, ..., zn) or y(z1, ..., zn), where x, y are names and z1, ..., zn are names or literals; • instead of C# assignments, the metainstruction x := e is used; • objects are created by a low-level metainstruction new_instance(); • types are explicitly initialized by metainstruction Init(C); • exceptions are handled by metainstructions catch(T, x) and catch(x); • declarations of fields and constants of classes and structs do not contain initializers. Instead, two methods SFI and IFI are reserved for each class declaration to perform static and instance field initialization, respectively. Those low-level metainstructions should not confuse the reader. C#-kernel is the intermediate language and its use is hidden from the user. 1.4. The C#-light abstract machine. In the classical approach [4], suitable for Pascal, a state of an abstract ma- chine is a mapping from variable names into values. Thus, variable names are treated as unique labels of memory cells, and the state takes the label as an argument and returns its contents. The presence of references makes this approach inconsistent in C# since every change of the state through a label must track all possible aliases. To overcome this obstacle, we use such a notion as memory management. Remember that access by name only exists in the input code. The executable code implies access by address. We introduce this two-level "name → address → value" access into our states. We do not identify the addresses with any concrete type. The uninterpreted symbolic constants are sufficient. We only require that every address be unique. Besides the two-level access, we also need to recognize object types, the internal structure of composite objects and thrown exceptions. Thus, a state is a mapping from the following abstract machine metavariables into their values: Name Type Description L N → L Program variable addresses V L → U Program variable values T TUN →∪ Program variable types L2 LNatNU →∪× )( Composite object structure V0 U Last evaluated value E U Last caught exception We will often use the abbreviation mvs instead of the tuple [L, V, T, L2, V0, E]. 1.5. Translation from C#-light into C#-kernel. The basis for this translation was developed in our work on C program verification [11]. Complex C-light constructs were translated into the sequences of simple C-light constructs. The features of C# require translation of some simple constructs as well. The idea is as follows: if the interpretation of a construct in the operational semantics leads to a series of changes of metavariables, we can replace the original con- struct by an explicit sequence of these changes. That is where the connection between metavariables and metainstruc- tions is established. Indeed, for every metainstruction of the form x := e, x is one of metavariables. For example, the declaration S x; is replaced by new_instance(); L := upd(L, x, V0); T := upd(T, L(x), Loc(S)); Формальні методи програмування 315 The first line results in allocation of a new memory cell, and the new address is stored in V0. In the second line, this new address is associated with the name x. The third line signals that the new cell contains an object of type S. 1.6. Axiomatic semantics of C#-kernel. The classical Hoare logic is a calculus of triples { P} S {Q}, where P and Q are assertions and S is a program. A triple is valid if the truth of the precondition P and termination of S implies the truth of the postcondition Q. The principal factor is that programs and assertions "share" the variables. The truth of assertions is expressed by the sets of states in which assertions are valid. For example, the formula x = 3 corresponds to all states where the memory cell x contains the value 3. The classical Hoare triple for an assignment looks like }{:)}({ PexexP =← , or equivalently ))}()(.({:}{ xxexxxPxexP ′←=∧′←′∃= , (1) where “← ” denotes the substitution. At the first sight, the use of metavariables destroys this concept. However, not only the variables turned into metavariables. The assertion and programming languages changed as well. As a result, we have a classical case: a rather simple programming language, and every program has only six variables (now called metavariables); the assertions over these metavariables are written in the specification language. This leads to a decisive idea: we can adapt the clas- sical Hoare logic for C#-kernel. However, the simplicity of the logic structure is obtained by the price of complex assertions. For example, the formula x = 3 is now expressed as V(L(x)) = 3. We defined the logic of C#-kernel as a forward Hoare logic, when the leftmost construct is interpreted. The dis- advantage consists in the complex quantified form (1) for an assignment. On the other hand, such a system allows us to discard the wittingly true triples. The proof environment, which signals about the current function, initialized classes and exceptions, also greatly reduces the proof. The complete axiomatic semantics for C#-kernel can be found in [7]. 1.7. The lazy verification condition refinement. In practice, the axiomatic semantics is used in the form of a verification condition generator (VCG). In our approach, VCG can produce inconclusive or lazy verification conditions. It results from the problems of dynamic binding and loop invariants displacement. 1) When the user declares a virtual function in a class, he defines the interface to the whole set of overriding functions in the derived classes. The information which function is actually invoked cannot be resolved statically. Thus, the polymorph function invocation in the general case requires quantification over the set of overriding functions [10]. The resolution of such a quantifier, though finite for every finite program, complicates its verification. Nevertheless, the forward Hoare logic proposes a partial solution. Indeed, in this case the information about ob- jects is accumulated in the precondition. When does the precondition "know" the object's dynamic type? First, when the object creation operation precedes the invocation point in one linear fragment. Second, when a user himself provided this information. In this case the invocation rule can use the specification of an appropriate function. We do not include the algorithms, which extract the dynamic type from the precondition, into axiomatic seman- tics. Instead, we use the lazy term CALL() which results in a very simple axiomatic rule: . The formal refining algorithm for CALL is described in [7]. If f is a simple function or f is a virtual function and precondition P "knows" the dynamic type of x, then CALL turns into a conjunction of P and of appropriate function specifications with some substitutes over metavariables. Otherwise, the quantifier over virtual functions will be added. We use the analogous term DELCALL to handle delegates [7]. 2) Loop invariants. The problem is that we rewrite the loops while, do, for and foreach with the help of the goto statement. The treatment of goto in the Hoare logic is also based on the notion of invariant. But in the general case we cannot use the original loop invariant as a new label invariant. In addition, during translation various "forward gotos" can appear and they do not form loops at all. And again, the lazy terms can help us. Every time the axiomatic semantics meets the statement goto L or a statement labeled by L, it introduces the symbolic formula INV(mvs, L). The idea of its refinement is as follows. Among the verification conditions we look for formulas of the form ),( LmvsINVA⇒ , and we take the disjunction of all A. Obviously, this formula is true every time when the control reaches the label L. And then every occurrence of ),( LeINV is replaced by that disjunction with substitution of e instead of mvs. 2. Verification condition simplification The two-level access model resolves the problems of composite objects and aliasing but leads to explosion in the size of verification conditions (VC). Let us consider the following simple example: C#-light program C#-kernel program { int x, y, z; x = 1; y = 2; { Init(int); new_instance(); L := upd(L, x, V0); Формальні методи програмування 316 z = 3; z = 4; } T := upd(T, L(x), Loc(int)); new_instance(); L := upd(L, y, V0); T := upd(T, L(y), Loc(int)); new_instance(); L := upd(L, z, V0); T := upd(T, L(z), Loc(int)); V := upd(V, L(x), 1); V := upd(V, L(y), 2); V := upd(V, L(z), 3); V := upd(V, L(z), 4); } Indeed, the original C#-light program is so simple that it could be verified in the classical Hoare logic [8]. How- ever, in our approach we verify the source program through the implicit verification of an equivalent C#-kernel program in the right column. Let us compare the VCs obtained in each approach. In each case we can use true as a precondition. The postcondition for the classical logic is 421 =∧=∧= zyx . Applying the rule (1) 4 times, we obtain the true assertion: 421 =∧=∧=⇒ zyxtrue . (2) Our C#-kernel logic requires the following postcondition: 4))((2))((1))(( =∧=∧= zyx LVLVLV . Besides the more complex precondition, the number of VCs doubles because of the metainstruction Init. If a type is not yet initialized, this metainstruction triggers the process of static initialization, otherwise it is ignored. Let us consider the VC without initialization. For simplicity we omit the existential quantifiers binding the indexed variables. The predicate SI(int, …) signals that the type int was actually initialized. The proof environment is also discarded because such a simple program does not depend on it. . (3) The size of VC surprises. Obviously, verification of real programs can lead to immense assertions which can eas- ily overcome the capacity of theorem provers. That is the price of the detailed memory model of C#-light/kernel. Thus, we need some simplification strategies which must precede the proof stage. Strategy 1. Let us have a close look at the boxed conjuncts. The formula ω=E signals that the value of the metavariable E is undefined. In other words, no exception was thrown during the execution. In this example the omitted environment also states the absence of exceptions, so this formula is true indeed. Analogously, handling of Init(int) adds the type int to the set of initialized types in environment. Thus, the formula SI(int, L1, V1, T1, L2) is odd. Further, the equalities of the form variable1 = variable2 are just renamings. We can prolong one of the variables. Finally, all we need is to proof that the values of x, y and z are 1, 2 and 3, respectively. This proof requires the information about metavariables L and V, and the types are irrelevant. Moreover, the consequent does not contain any occurrence of the metavariable T. Thus, we can omit the conjuncts for T2, T3 and T. To justify such a discard, we have to show that the formula ))(),(,())(),(,,())(),(2,( 332312 intzintyintx LocLTupdTLocLTupdTLocLTupdT =∧=∧= is valid. This can be easily done. Strategy 2. In the consequent of VC, the variables are matched against the constants. So, we can split VC into three formulas in accordance with the number of conjuncts in the consequent. Let us consider one of these new formu- las (strategy 1 is supposed to be already applied): 4))(( )4)(( )3)(()2)(()1)(( )()()( )2()2()2( 4 342312 33223112 111122133 LV , , LV updV , , LV upd V, , LV upd V, , LV upd V , d, L updL , d, L upd L, d, L upd L , L, V, Ldnewp, L, V, Ldnewp, L, V, Ldnewp =⇒               = ∧=∧=∧= ∧=∧=∧= ∧∧∧ z z zyx zyx . Формальні методи програмування 317 Strategy 3. The type int is not a reference type, and C#-light does not support pointers. This guarantees that the object z does not have aliases, i.e. the assignments to x and y affect z only if the expressions over x and y are assigned to z. Let us begin to unfold the antecedent from the rightmost conjunct. If a term upd binds z with a value which is in- dependent from x and y, then this upd can be substituted into the consequent. Otherwise, the upd is ignored. [ ] . 4)))(()(4))((( )2()2()2( 33334 111122133 z, z, dLupd, z, z, dL, updVupd , L, V, Ldnewp, L, V, Ldnewp, L, V, Ldnewp = ⇒ ∧∧ (4) The standard semantics of upd (Section 2.2) provides the truth of this assertion. Though (4) is still bigger than (2), it is significantly simpler than the original VC (3). Moreover, the proof does not depend on the fact that 1d , 2d and 3d are new addresses (expressed by predicate newp), so the whole antecedent can be discarded. In the next Section we will demonstrate the VCs after the application of simplification strategies. 3. Some verification challenges and their solution The majority of examples are borrowed from [1]. Originally written in Java, they were translated into C#. Both languages are syntactically (and semantically) close, so the translation is trivial. The volume of the paper does not allow us to represent here all of them. Let us consider the most interesting. The remaining cases can be found in [9]. 3.1. Aliasing. Aliasing can make the classical Hoare logic fully inconsistent. For example, if x and y point to the same memory object, then the following Hoare triple becomes incorrect: }1{}1{ === xx 2:y . The explicit memory modeling allows us to find a solution. Consider the following program: class C { C a; int i; C(){ a = null; i = 1; } } class Alias { int m(){ C c = new C(); c.a = c; c.i = 2; return c.i + c.a.i; } } Here, the field i is accessed via the object c, as well as via an aliased reference to c itself in the field a. The specifications for program methods are as follows: Pre(C.C) : true , Post(C.C) : 1))(())(( =∧= iLVnullaLV , Pre(Alias.m) : true , Post(Alias.m) : 40 =V . Verification of the constructor C() is not of great interest, so let us consider the method Alias.m(). In our axiomatic semantics, treatment of the method invocation (here, C()) doubles the proof tree. In combination with the explicit exception propagation [9] this leads to generation of 16 verification conditions. However, 14 VCs are tautolo- gies as long as they have form Φ⇒false . Two remaining must be proved. Let us consider one of them (after simplifi- cations): 40 ))2(())1((0 )))),)),(((2((2,2,())),(((2,1,( )2),)),(((2,( )))((),)),(((2,()))((),(,( =⇒             += ∧′′′′=∧′′=′′ ∧′′′′′′′= ∧′′′′′′′′=′′′∧′′′′=′′ V yLVyLVV iacLVLVLyLupdLicLVLyLupdL icLVLVupdV cLVacLVLVupdVxLVcLVupdV . The standard semantics of upd guarantees its truth. The second VC is similar and can be checked by analogy. 3.2. Breaking out of a loop. The exit from a loop without a loop condition check is usually performed via a break statement (rarely, via goto). This complicates the underlying control flow semantics. In our approach the break statements are replaced by gotos. This results in appearance of new labels. In Section 2.7 we discussed how to tame these labels and their invariants. Формальні методи програмування 318 The input C#-light program looks like this: class C { int[] ia; void NegateFirst(){ for (int i = 0; i < ia.Length; i++){ if (ia[i] < 0){ ia[i] = -ia[i]; break; }} } } When the first negative element is reached, its sign changes and the loop aborts. The unconditional exit from a loop in combination with a potential array update leads to complication of specifi- cations: The original array content is stored in an auxiliary variable old. Let us consider one of verification conditions. It corresponds to execution from the beginning of C.NegateFirst() up to the loop entry point. [ ] )(Inv)0),(,()),(],,0,2,,,[],,[,,( forb0x0iint ⇒′′′=∧Φ′′′< VLVupdVmvsEVLTVLCALL λ where )0),(,( ))))0),(,()))),(((2(0 )],,0,2,,,([,( ],,0,2,,,[[],,,( VLVupdV VLVupdVLVLVV EVLTVLINVmvs EVLTVLCALL x0 x1iathis L1 x1get_Length ′′=∧           ′=∧′′= ∧′′ ′′ ≡Φ λ . To unify the semantics, the standard operations are used in the syntax of the method invocation. That is why there are the terms CALL for the relation < and array property Length. They assert that the value of the counter i is less than the array length. This length is stored in a new program variable x0 which appears during translation into C#- kernel. Also, note the presence of a lazy invariant for a new label L1. Its refined value is the conjunction of two asser- tions. The first one is the precondition Pre(NF()), where the program variable i is associated with the value 0. The sec- ond one is the loop invariant Inv(for) propagated (as a proof precondition) through the loop body. Note that if the end point of the loop body is reached, then for the current value of i the elements ia[0], ..., ia[i-1] are not negative, i.e. the loop was not aborted via break. Thus, the VC antecedent states that either we have a zero-length array or the loop invariant is not affected by the counter increment. By induction, this implies the consequent Inv(for). 3.3. Static initialization. A static field/method of the class belongs to the class itself. In particular, it can be used even if there is no instance of the class. In C# (as well as in Java) static initialization is lazy. For example, in Java a class is initialized at its first active use. Thus, if initialization causes some nontrivial processes in a program, then the user must precisely know where it will take place. C# makes the situation even worse as long as a class is initialized immediately at the first active use only if a static constructor exists. Otherwise, initialization takes place at an implementation-dependent time prior to the first ac- tive use [3, §17.4.5.1]. This results in nondeterminism of the class initialization order. Let us consider the following example: class C { static bool r1, r2, r3, r4; static void m(){ r1 = C1.b1; r2 = C2.b2; r3 = C1.d1; r4 = C2.d2; } } class C1 { public static bool b1 = C2.d2; public static bool d1 = true; Формальні методи програмування 319 } class C2 { public static bool d2 = true; public static bool b2 = C1.d1; } In Java the situation is deterministic though it can be a surprise for a user. The first assignment in the body of the method m() triggers initialization of the class C1, which in turn triggers initialization of the class C2. During initializa- tion of C2, initialization of C1 suspends and the default field values are used. As a result, static field C2.b2 is set to false and all other fields are set to true. If the first two strings in the method body of m() are switched, the class C2 will be initialized before the class C1, resulting in all fields getting the value true. In C# the order of initialization is unknown even though the compilation in Visual Studio provides the Java's re- sult. So, in the general case, this program is indeed a challenge, but we can use the static constructors. Let us add the declarations static C1(){} and static C2(){} to classes C1 and C2 respectively. It will guarantee the needed behavior. The program specifications are as follows: Pre(C.m) : true Post(C.m) : ))(())(())(()(( r4r3r2r1 LVLVLVLV ∧∧¬∧ According to the definition of C#-kernel (see Section 2.3), the field declarations do not contain initializers. Dur- ing translation from C#-light, the static field initializers move into the special methods SFI which are created for each class automatically. Because these methods cannot be specified a priori, their semantics consists of inline body substitu- tion. As a result, the VCs can grow significantly: . However, the proof is straightforward. 3.4. Overriding and dynamic types. Consider the following program: class C { virtual void m() { m(); } } class D : C { override void m() { throw new System.Exception(); } void test() { base.m(); } } At first sight, it looks like the method test() will loop forever. In reality it is not the case: the method test() calls the method m() from the class C, which calls the method m() from the class D, since the reference this has the runtime-type D. It should be noted that in Java the program behavior can be clearer for a reader, because exceptions thrown from the methods can be expressed explicitly. In particular, in [9] all three methods are accompanied by the note throws Exception. The specifications are as follows: Pre(D.m) : true, Post(D.m) : T(E) = SE, Pre(D.test) : T(V(this)) = D, Post(D.test) : T(E) = SE. Pre(C.m)(mvs, a) :             ⇒= ∧′⇒′ ∧′∀∀⇒≠ ∧=∧=∧ ))(( ))))(,)(([]),,,,( )[],,,,((()(( )()())((),( 0 trueC this thisC thisC.m aTOT arestsmvfprefmvssmvsubst mvsmfinvokersmvfaTOT amvsmvsaTOTVTmvsprec ; Post(C.m)(mvs, a) :       ⇒= ∧⇒∀⇒≠ ))(( ))))(,)(())([],,,,(()(( 0 falseC thismC aTOT arestmvsfpostamvsfinvokerfaTOT . Here a stands for the tuple of specification parameters for C.m() and TOT(a) denotes the type of this. Also mvs0(a) stores the initial values of metavariables from mvs. For brevity, we use the name SE instead of Sys- tem_Exception. The postcondition of D.Test() states that E stores the uncaught exception of type SE. The specifi- Формальні методи програмування 320 cation of C.m() represents the case analysis. If the type of this is C, then the invocation m() leads to the endless loop ))(( false⇒= CaTOT . Otherwise, the invocation m() is the invocation of a proper implementation of C.m() from some derived class. The collection of all those implementations is quantified by the variable f which satisfies the predi- cate invoker. This predicate is a logical representation of the late binding algorithm. The classes C and D do not contain the fields. As a consequence, the intermediate C#-kernel program is very short and can be represented as follows: class C { virtual void m() { this.m(); } } class D : C { public void IFI_D() { Init(C); this.IFI_C(); } override void m() { Init(System_Exception); new_instance(); L := upd(L, x, V0); T := upd(T, L(x), Loc(System_Exception)); new_instance(); V := upd(V, L(x), V0); T := upd(T, V(x), System_Exception); x.IFI_System_Exception(); x.System_Exception(); E := V(x); } void test() { base.m(); } } Note that the initializing method IFI appears in D (the corresponding method in C turns to be empty). The name spaces are eliminated during translation, so the full name System.Exception is replaced by the global level name System_Exception. Two VCs are generated for D.test() depending on the fact that the exception is/(is not) raised during the invo- cation base.m(). The VC without exception is tautology, since it has the form Φ⇒false . The VC with exception looks like this: SEETEVTmvsprecmvsmvsCALL =⇒≠∧=∧ )())))((),(,(,[],,,( ωλ DthisD.testbasem . The formula prec() is automatically added by the generator for every method to represent some standard assump- tions about the class and this. However, it does not affect the truth of this VC and we can omit its definition. The details of refinement of this VC can be found in [9]. 3.5. Delegates. A delegate is a class that encapsulates a function signature. In comparison with the C function pointers, the higher level of safety is provided. In the general case, a delegate points at a list of functions, though this feature is rarely used. Let us consider the following program: class Minimum { public delegate bool Order(int e1, int e2); static public int Find(int[] arr, Order ord) { int min = arr[0]; int xx = 0; while(xx < arr.Length){ if(ord(arr[xx], min)) min = arr[xx]; xx++; } return min; } } class C { static bool LessThan(int e1, int e2) { return e1 < e2; } static void Main(string[] args) { Minimum.Order order = new Minimum.Order(LessThan); int[] arr = new int[] {3, 5, 1, 7, 4}; Формальні методи програмування 321 Minimum.Find(arr, order); } } The class Minimum declares the method Find(), which should find the minimal element in an array of integers. The method which defines the ordering relation is passed to the method Find through the delegate parameter (compare with the C standard library function qsort). Here we use the usual " < " relation. The program specifications are as follows: When we specify the method Minimum.Find(), we come across the same problem as for virtual methods. To avoid the limited applicability of Minimum.Find(), we should not specify the relation "<" as the only possible parame- ter. Thus, the parameter is specified by the abstract logical function somef without any assumptions about its nature. Among the dozens of VCs we can consider the following (after simplifications): )()0),(,()),(],,0,2,,,[],[,,( 44 Lxxxxint INVVLVupdVmvsEVLTVLCALL ⇒=∧Φ++ λ (5) where       =∧=∧= ∧Φ ≡Φ )),(2),(,())(()0),(,( ),(,],,[)),((( 34323 22 xxarrminbb minx0ord VLVupdVLVVLVupdV mvsmvsLVDELCALL true λ .                 =∧=∧= ∧ = ∨⇒≤∀ ∧≤∧∧ ≡Φ )),(2),(,(),,(),,( ) )))),(((2())(( ))))),(((2()),((())((.( )))),(((2())(()2,,,()2,,,( 1222112 11111 1111111 11111111122 2 xxarrx0x0b arrmin arrminxx Lengtharrxx VLVupdVdLupdLdLupdL kLVLVLV kLVLVLVsomefLVkk LVLVLVLVLdnewpLVLdnewp . It corresponds to the body of the while loop and contains all variants of lazy terms: CALL, DELCALL and INV. The proof sketch is as follows. The enclosing CALL corresponds to the fragment xx++. The assertion V0 = V(L(arg)) + 1 can be taken as a postcondition of increment. Note that the side effect of increment is expressed in the second con- junct in (5). During the refinement of CALL, the name V is replaced by V4 and Φ filters outwards. Here, the delegate points at the single function, so the refinement of DELCALL resembles the refinement of CALL. Thus the adapted dele- gate specifications are added to 2Φ . Finally, it is easy to establish that INV(L) = Inv(while)(V0 ← V(L(min))). As a result (5) asserts that the loop invariant is preserved, when the counter xx increases. The proof is by induction. 4. Related work Interesting results on semantics formalization for Java and C# are described in [10, 12, 13]. As a rule, these pa- pers either propose a detailed low-level semantics, inconvenient for verification, or verification is the initial goal but the language coverage is poor. The LOOP project [14] was started in 1997 at the University of Nijmegen. Apart from threads and inner classes, the majority of Java features are covered. The LOOP tool is effectively a compiler, which takes a Java program and its JML specifications as input. As output, it generates several lemmas in the syntax of the theorem prover PVS. Semantics of objects and classes is based on the algebraic approach. Most of the case studies are so-called Java Card programs designed to run on smart cards. The main shortcoming is that the tool works effectively only for small programs. Another well-known example is the ESC/Java project [15] which supports a wide Java subset. A subset of JML is used for specifications. A wp-calculus is used as the semantics description, and a theorem prover searches for poten- tial bugs. However, the full functional correctness was not the aim, and the developers have deliberately chosen an un- sound and incomplete approach to maximize the number of typical bugs that the tool can spot fully automatically. The Spec# system [16] looks promising. It integrates into Visual Studio and .NET Framework, so the complete infrastructure, including libraries, designing and editing tools, is provided. The Spec# language is a superset of C#. It Формальні методи програмування 322 provides user specifications, non-null types and some means for high-level data abstraction. The specifications become a part of a program and can be checked dynamically. The static check in the theorem prover Boogie is also supported. The problem of verification condition complication affects the practicability of a verification system. In [17] Luckham and Suzuki considered the issue of the combinatorial size explosion, when the update function upd is used for arrays and pointers. Some algorithms for reducing the recursive upd invocations were proposed. The two-stage VC generation algorithm has been implemented as a part of ESC/Java [18]. The first stage trans- lates a source fragment into an assignment-free, or passive, form. The second stage uses a VC generation technique that is optimized to exploit the assignment-free nature of the passive form. This two-stage algorithm creates a VC, whose size in worst-case quadratic in the size of the source fragment, and in practice appears to be close to linear. Another approach consists in simplification of VCs before their proof. Gribomont [19] proposed a strategy for the propositional case, corresponding to digital circuits and concurrent synchronization algorithms. Efficiently comput- able criteria allow one to detect and discard provably irrelevant parts of boolean VCs. Conclusion In this paper we have described some C# program verification challenges and proposed the solution techniques in the framework of the C#-light project. These techniques form the tree-level approach which extends our two-level approach to C program verification [11]. The advantages of C#-light and of the approach are as follows: • The C#-light language supports the major part of sequential C#. • The verification process is based on a simple Hoare-like logic. The simplicity results from translation of se- mantically difficult C#-light constructs into C#-kernel and postponement of handling some dynamic aspects until the refinement stage. • Unambiguous inference of lazy verification conditions by means of forward proof rules considerably reduces the number of generated lemmas. The complexity of verification conditions is not usually proclaimed as a challenge. However, our detailed mem- ory model can lead to the combinatorial explosion in the length of terms. Therefore we proposed some strategies which considerably simplify the verification conditions. The three-level approach is promising for applications. We are developing an experimental tool for C#-light pro- gram verification including C#-light to C#-kernel translator, verification condition generator as well as lazy verification condition refiner. It is supposed to use a static analyzer to check applicability of simplification strategies. Our list of challenges is not complete. The problems such as termination, class invariance, low-level arithmetic, as well as some advanced features mentioned in [2], can form the framework of future research. 1. Jacobs B., Kiniry J.L., Warnier M. Java Program Verification Challenges // Lect. Notes Comput. Sci. – 2003. – Vol. 2852. – P. 202–219. 2. Leavens G.T., Leino K.R.M., Muller P. Specification and verification challenges for sequential object–oriented programs // TR#06–14a, Dept. of Computer Science, Iowa State University, 2006. 3. C# Language Specification. Standard ECMA-334, 2001. 4. Apt K.R., Olderog E.R. Verification of sequential and concurrent programs. – Berlin etc.: Springer, 1991. – 450 p. 5. Hoare C.A.R. An axiomatic basis for computer programming // Commun ACM. – 1969. – Vol. 12, N 1. – P. 576–580. 6. Dubranovsky I.V. C# program verification: the translation from C#–light into C#–kernel. – Novosibirsk, 2006. – 56 p. – (Prep. IIS SB RAS; N 140). 7. Nepomniaschy V.A., Anureev I.S., Dubranovsky I.V., Promsky A.V. Towards C# program verification: A three–level approach // J. "Programmi- rovanie" Russian. – 2007. – N 4. – P. 1–16. 8. Nepomniaschy V.A., Anureev I.S., Promsky A.V. Towards C# program verification: C#–kernel and its axiomatic semantics // Proc. CS&P'2006. – Humboldt University, Berlin. – Specification. – 2006. – Vol. 2: – P. 195–206. 9. Promsky A.V. Application of three–level approach to C#–light program verification. – Novosibirsk, 2006. – (Prepr. / IIS SB RAS; N 139). 10. Oheimb D.V. Hoare Logic for Java in Isabelle/HOL // Concurrency and Computation. – 2001. – Vol. 13. – P. 23 – 36. 11. Nepomniaschy V.A., Anureev I.S., Mihailov I.N., Promsky A.V. Verification–Oriented Language C–light // System informatics. – Novosibirsk, 2004. – Vol. 9. – P. 51–134. 12. Borger E., Fruja N.G., Gervasi V., Stark R. A High–Level Modular Definition of Semantics of C# // Theoretical Computer Sci. – 2004. – N 336(2/3). 13. Poetzsch–Heffter A., Muller P. A Programming Logic for Sequential Java // Lect. Notes Comput. Sci. – 1999. – Vol. 1576. – P. 162–176. 14. Jacobs B., Poll E. Java Program Verification at Nijmegen: Development and Perspective // Lect. Notes Comput. Sci. – 2004. – Vol. 3233. – P. 134–153. 15. Leino K.R.M. Extended Static Checking: a Ten–Year Perspective // Lect. Notes Comput. Sci. – 2001. – Vol. 2000. – P. 157–175. 16. Barnett M., Leino K.R.M., Schulte W. The Spec# programming system: An overview // Lect. Notes Comput. Sci. – 2004. – Vol. 3362. – P. 49–69. 17. Luckham D.C., Suzuki N. Verification of array, record and pointer operations in Pascal // ACM Trans. Progr. Lang., and Systems. – 1979. – Vol. 1, N 2. – P. 226–244. 18. Flanagan C., Saxe J.B. Avoiding Exponential Explosion: Generating Compact Verification Conditions // ACM Press, January 2001. – P. 193–205. 19. Gribomont P.E. Simplification of Boolean verification conditions // Theoretical Computer Sci. – 2000. – Vol. 239, N 1. – P. 165–185.

C# program verification problems: solution by a three-level method

Institution

Similar Items