The denotational semantics of a programming language specifies the value or output from a program given a formal definition. In this sense when we see the expression 5 + 2, we only think of this as the value 7, all expressions that evaluate to the same thing are considered equivalent. In this sense, we look past the notation.

For example if we define [[e]] to be the meaning of e, then we would be able to write things like

[[v]] = v

[[(+ e1 e2)]] = [[e1]] + [[e2]]

[[(- e1 e2)]] = [[e1]] - [[e2]]

This would be representing denotational semantics because we don't care how things are evaulated no matter how complicated the expression is, we only care about what it evaluates to and properties of that evaluated value.

[[equal? e1 e2]] = #t if [[e1]] = [[e2]], #f otherwise
[[if e1 e2 e3]] = [[e3] if [[e1]] = #f, [[e2]] otherwise ; on the right we use meta logic symbols

As you can see, by using denotational semanics, we can quickly and easly write down how things should evaluation, where the details don't matter so much. At the same time this is a problem because if we want to actually implement things on a computer, we need to know the details.

Operational Semantics

The operational semantics of a programming language describes the process by which the value of a program is computed, e.g) how evaluation actually takes place.

When working with operational semantics, we characterize executions as evaluation derivations.

Judgement

$e\downarrow v$ claims that the expression $e$ evalutes to $v$

For ease, we will use e |_ v for judgment. For any value we have that $v\downarrow v$.

We will also have rules, or inference rules, which allow us to synthesize new information about evaluations, for example the rule for adding is given by:

e1 |_ v1 & e2 |_ v2 & v = v1 + v2
---------------------------------
(+ e1 e2) |_ v (ADD)

With these tools in place we can prove that in our evaluation system a given expression evalutes to a particular value. For example, we can prove that (+ 3 (+ 5 4)) |_ 12 as follows

We know that 5 |_ 5 and 4 |_ 4 as this is true for any value, in regular arithmetic we have 9 = 5 + 9 therefore by the ADD inference rule we can deduce that (+ 5 4) |_ 9, now we can try doing the same thing, 3 |_ 3 and (+ 5 4) |_ 9, and we know 12 = 9 + 3, therefore by the ADD inference rule we deduce that (+ 3 (+ 5 4)) |_ 12.

We can generlize this rule for any binary operation (and any n-ary operation)

e1 |_ v1 & e2 |_ v2 & v = v1 op v2
---------------------------------
(op e1 e2) |_ v (GENERIC-BOP)

If we take after racket's method for handling if then statements then we have the following rule

Note that in racket we don't require e1 to be a boolean, which is modeled in the inference rule (IF-T). Alternatively if we were designing a different language, we could have done it like this

e1 |_ #t & e2 |_ v2 & v2 is a bool
----------------------------------
(and e1 e2) |_ v2

Environments

When defining things, we are saving information to memory, we can generalize this idea with what's known as an environment, the environments grammar is this:

E ::= . (empty environment)
| x -> v, E (one binding, rest of environment)

Where x is any legal identifier and v is any values. for example it could be x1 -> 3, x2 -> 5, ... x25 -> 39

With this system in place we can make judgments under an environment, notated like this: E |- e |_ v which stands for "e evaluates to v under the environment E"

We also have lookup notation: E(x) which returns the most recent value of v that x binds to in E. If there is no binding for x, a name error occurs, so we have the following inference rule

E(x) = v
--------
E |- x |_ v (VAR)

Let's redo the generic binary operation wrt environments:

E |- E1 |_ v1 & E |- e2 |_ v2 & v = v1 op v2
--------------------------------------------
E |- (op e1 e1) |_ v

Which states the same thing, but mentions that an identifier evalutes to a certain value under the given environment, and concludes that under that same environment the entire thing evalutes to the value v.

We also allow for updating an environment: E |- s ||_ E' which states "after executing statement s under E, the new environment is E'.

E |- e |_ v & E' = x -> v, E
----------------------------
E |- (define x e) ||_ E' (DEFINE)

Under the current environment, e evalutes to v, then by evaluating (deifne x e) our environment becomes the initial one with the extra binding x -> v. If we want to have an immutable definition we can do the following

E |- e |_ v & x notin E & E' = x -> v, E
----------------------------
E |- (define x e) ||_ E' (DEFINE)

In this case if we tried to redefine an immutable variable it would get stuck because there is no valid inference rule to evaluate in this situation.

We can model lambda calculus rules by saying that lambda expressions (λ (x) e) can be evaluated to λx.e and have substituation notation of the form [e2/x]e1, which means "replace every occurence of x in e1 with e2". THen we have the following function application rule:

E |- e1 |_ λx.e3 & E |- e2 |_ v2 & E |- [e2/x] e3 | v
------------------------------------------------------
E |- (e1 e2) |_ v

Which first evalutes e1 into a lambda value with expression e3 as body, then if the argument evaluates to v2, and in the environment if substituting x with v2 in e3 results in a value v, then the entire function application in that environment results in v.

Let's design a simple imperiative proramming language that has variable assigment, if statement sand while loop statements with only integer values.

e ::= x | (e1 op e2)
s ::= x = e
| s1; s2
| if (e) s1 else s2
| while (e) s1

Note that s1; s2 tries to model the ability to run things in succession

With this language we can determine how every expression should be evaluated

E |- e1 |_ v1 & E |- e2|_ v2 & v = v1 op v2
-------------------------------------------(BIN-OP)
E |- (e1 op e2) |_ v
E(x) = v
-----------(VAR)
E |- x |_ v
E' = x -> v, E & E |- e |_ v
----------------------------(ASSIGN)
E |- x = e ||_ E'
E |- s1 ||_ E' & E' |- s2 ||_ E''
---------------------------------(SEQ)
E |- s1;s2 ||_ E''

The last inference rue here says that if by executing s1 it transforms E to E', and by executing s2, in the new environment E', then you get the environment E'', then by executing them sequentially, it will transform E to E'', (transitivity)

E |- e |_ 0 & E |- s2 ||_ E'
-----------------------------(IF-F)
E |- if (e) s1 else s2 ||_ E'
E |- e |_ v & v != 0 & E |- s1 ||_ E'
-----------------------------(IF-F)
E |- if (e) s1 else s2 ||_ E'

Note that we no longer make statements about evaluation of things but rather about how the environment changes, this is because in an imperitive programming language things may not return values.

E |- e |_ 0
-----------------------(WHILE-END)
E |- while (e) s1 ||_ E
E |- e |- v & v != 0 E |- s1 ||_ E' & E' |- while (e) s1 ||_ E*
----------------------------------------------------------------(WHILE-CONT)
E |- while (e) s1 ||_ E*

The last rule here check if another iteration occurs, and if it does then it figures out how the environment changes after the whole loop terminates, and then says the entire thing updates the env to that.

Specifically this says if under the env E, we know that e evalutes to v, and v != 0, (then we should continue) and by evaluating the statement s1, we obtain the new env E', and we know that under this new env while (e) s1 yields some new env E* (this is where the recursion occurs), then we know that the entire loop should evaluate update our environment to E*.

If we created an infinite while loop then we can't use WHILE-CONT because we would not be able to figure out the judgement E' |- while (e) s1 ||_ E* [but what if the environment never changes?]

We can also augment this to support memory load and store, like most impreative programming languages have, so we can try to add pointers to our grammar:

e ::= x | (e1 op e2) | *x
s ::= x = e
| *x = e
| s1; s2
| if (e) s1 else s2
| while (e) s1

We'll have to augment our environment to (E, m) where E is the original variable binding list and m is a map: Int to Int, that maps integer address to their corresponding stored integer values. We'll denote the syntax m[a -> b] to be a new map with the sam mappings as m except the address a is now mapped to b. Initially we'll denote o/ as the empty map.

E(x) = v
----------------(VAR)
(E, m) |- x |_ v
E' = x -> v, E & (E, m) |- e |_ v
---------------------------------(ASSIGN)
(E, m) |- x = e ||_ (E', m)
E(x) = v1 & m(v1) = v2
----------------------(LOAD)
(E, m) |- *x |_ v2
m' = m[v2->v1] & (E, m) |- e |_ v1 & E(x) = v2
----------------------------------------------(STORE)
(E, m) |- *x = e ||_ (E, m')