Notional Machines

This blog post was inspired by our work at the Dagstuhl seminar “Notional Machines and Programming Language Semantics in Education“. We spent wonderful five days discussing the definition, roles and, design of Notional Machines. I would like to thank the organizers of the event, Mark Guzdial,
Shriram Krishnamurthi, Juha Sorva and Jan Vahrenhold for this great opportunity to be with so many of the researchers which I admire.

If you are looking for an introduction to Notional Machines, Juha Sorva wrote an excellent review of the topic some time ago. The content of this blog post is motivated by our work in the second part of a breakout group at Dagstuhl where I, Otto Seppälä, Paul Denny and Brett Becker reflected about the design of a Notional Machine.

Before posting my definition, I would like to point that we had more than 40 researchers from many different fields and each one will have a different perspective on the topic. For example, Ben du Boulay (who coined the term Notional Machine), gave his very simple definition:

“A Notional Machine is the best lie we tell students [about how the machine works]”

Ben du Boulay

Ben Shapiro offers a more refined definition on his guest post in Mark Guzdial’s blog:

A notional machine is an explanation of the rules of a programmable system. The rules account for what makes a program a valid one and how a system will execute it.

Ben Shapiro

Long story short, a Notional Machine is a simplification – a necessary simplification so we can understand how a program behaves in a system. After all, not even experienced programmers (with very few exceptions) truly knows what happens at the operational system and hardware levels when a program is executed. The point is, how big is the lie and how we tell it.

While Ben Shapiro’s definition is clear and concise (you can read about his reasoning in the blog post), I would like to present a definition that more clearly addresses the intention behind the design and usage of a Notional Machine:

“Notional machines are artifacts intentionally designed to serve the pedagogical purpose of representing and explaining the behavior of a computational system. A notional machine uses terminology and abstraction levels aimed at a particular audience to support their practices in a particular context. It is often a simplification and can be communicated in different formats.”

In my definition, the main goal of a Notional Machine is still to present the rules of a system. However, I would like to make explicit what makes a representation of behavior (or semantics) a Notional Machine or not. Is a metaphor a Notional Machine? Is a simple oral explanation a Notional Machine? If we use only the main goal of explaining the behavior of a system, metaphors and explanations accomplish such goal, so, yes they are Notional Machines. But if every explanation is a Notional Machine (and we routinely, in one way or another, use explanations or examples to present the behavior of a system) what is the usefulness of the concept?

How to design a Notional Machine?

Perhaps a more pertinent question should be, what makes a good or bad Notional Machine, and to whom. That is why it is extremely important to convey the intention and process behind the design of the Notional Machine. Notional Machines should serve the pedagogical purpose of communicating a complex system to an audience, and Notional Machines should be intentionally designed with this goal in mind. The audience will dictate the terminology used to describe the behavior and the granularity of the abstraction and detail in this Notional Machine. Finally, the Notional Machine can be presented textually (e.g. formal semantics, textual representations), visually (e.g. code visualization, program simulation) or even orally (we do it in classes!) using a variety of tools: examples, metaphors (good and bad ones), images, physical experiences, and so on. The image below represents the process of designing a Notional Machine:

Semantics

The semantics are tied to a specific system (or more commonly, a programming language such as Java, Python or Racket). The completeness of the semantics will define how much of this particular language will be covered by the Notional Machine since in most cases is not necessary to cover all of its behavior. This is a pedagogical decision that should be tied to instructional design and the goals this Notional Machine should achieve. For example, when introducing a Notional Machine of C++ to Arduino users, it is only necessary to cover a subset of the language. In extreme cases, a formal definition of the language is a complete Notional Machine, probably aimed at experts users.

While completeness is a pedagogical decision guided by the target system and audience, the Notional Machine should be as sound as possible to avoid introducing misconceptions. That won’t be always possible but should be our design goal. In practice, the interplay between completeness and soundness gets more complicated when designing a progression of Notional Machines.

The idea behind a progression is based on the Theory of Conceptual Change. We do not introduce all features of the language at the same time to students (that would overload them), so in most cases a Notional Machine in pieces, where fidelity decreases at first in order to form basic schemas and later increases to form more complete mental models (e.g. see the Semantic Waves conceptualization when introducing new concepts) is a more appropriate instructional design.

When learners possess incomplete prior knowledge (or even complete to a subset of the language) and correct, introducing new elements to the Notional Machine (therefore making it more complete) is easy if we can keep the semantics sound (or as sound as possible). Our goal should be designing incremental Notional Machines. We should avoid “destroying” pre-existing mental models to accommodate new ones, since this change is rather difficult for learners. We should never say something such as “Hey, remember that behavior we discussed in a previous class (which by the way, even being incorrect was possibly coherent to students)? Forget about it, that’s not how it works, now I will show you how it truly works”.

Audience

Imagine how to communicate a single aspect of Python, variables storing values, to 3 distinct audiences: elementary school, high school, and CS1 students. The semantics are the same, but the level of detail and vocabulary could be vastly different. For example, to elementary school students, saying that a variable is a box that stores a value could be sufficient to convey the idea that variables hold values. We are using words and concepts that are commonplace for these students and the level of detail is very low. High school students are already familiarized with how a computer works, that it has limited memory (or even know that it has hard drives and RAM) and manipulation of data are done by accessing this memory. To this audience, we could say that variables are sections in the computer’s memory that store a single value and are manipulated using an alias to this value, the variable name. To a more specialized audience, such as CS1 students, we could say that variables are alias identifiers to a certain number of bytes, which are allocated on the stack if the variable has a primitive type, or in the heap if it is a reference type, holding one value.

The language and level of detail used by the Notional Machine will dramatically change based on the audience, but still conveying the same semantics.

Presentation

In my conceptualization, the Notional Machine is completely independent of its presentation.

More frequently, Notional Machines were presented implicitly, embedded in program visualizations. How much of the semantics and level of detail of the Notional Machine is possible to elicit from a visualization is still an open question. Let’s look into two different examples, PythonTutor and JSVEE. Both are visualizations to the same programming language, Python, with the same semantics but using different levels of abstraction to visualize this particular program (you can click the links and visualize it using the tools):

a = 2
b = 3 + 2
print(a + b)
  • PythonTutor evaluates a expression in a single step, no matter how many elements it has. It shows frames and objects and uses arrows to show references from variables to values.
  • JSVEE evaluates a expression step by step. It shows a stack and use boxes to represent references from variables to values.

But… do we have explicit representations of a Notional Machine? How do we communicate explicitly a Notional Machine? Using… textual representations? I know we have many semantic representations of languages (Shriram Krishnamurthi kindly linked me this great Python semantics by Joe Gibbs et al), and while such representations are elegant, complete and sound, they require a particular skill set to be comprehended that could not be accessible to a wider variety of audiences. But they are certainly an example of a Notional Machine! Aimed at Programming Language professionals, and perhaps using one of the most efficient ways to communicate it… if you can understand the jargon.

Ok, do we have something more friendly to other audiences? I was looking for textual representations or examples of Notional Machines explicitly represented. To my surprise, I found … only one!

An example of a Python Notional Machine

Thanks to Greg Wilson, who provided his vision of a Notional Machine for Python, we had something concrete to evaluate at Dagstuhl. The goal of our breakout group was to create Notional Machines for Python and Scratch, but we quickly discovered that we did not know Scratch semantics enough, nor was easy to elicit its semantics by running program examples.

We started analyzing Greg’s Notional Machine and listing what was specific to Python, what could be considered “generic” and what could also be applied to Scratch. We found that many things could be considered generic, as this Notional Machine tries to present aspects of how imperative programs are executed. While it succeeds in many aspects, the Notional Machine could tell more about the semantics of Python itself, the vocabulary used is not always consistent and the level of detail changes in many of its parts.

My Python (or a very small subset of it) Notional Machine

In our breakout group, our goal was not to present a better Notional Machine, or an improvement to Greg’s Notional Machine, but instead focus on the intention and process behind its design:

  • We are following a semantics formal representation. At that time we were not aware of Joe Gibbs Python semantics, but we were deeply influenced by Joe’s masterclass of semantics a day before. Our process more or less follows the formal definition of the language semantics, from the simplest Notional Machine that represents the simplest element in the language, and later introducing new elements to increase its completeness, much like the semantics introduce new production rules.
  • We have an audience. In this case, we aimed this Notional Machine to CS1 students. This audience will define the available vocabulary, what subset of the language will be presented, and set the goal for a level of detail and abstraction. We choose to cover literals, variables, expressions and assignment only, for simplicity reasons. We assumed a number of simplifications regarding types, values and many other characteristics of the language, again for simplicity reasons, but we plan to later extend this Notional Machine to be more complete. Given the audience, we are not concerned with details of memory management or pointer references, for example. Elements such as stacks, heaps and frames will be presented in later refinements of the Notional Machine as we introduce new elements of the language. Therefore, we propose a progression of Notional Machines that increase the completeness and detail level of the Notional Machine until we reach a saturation point, which is defined by our goal.
  • We have a presentation format. Our goal is to communicate the Notional Machine explicitly to our audience. Is an open question if using explicit textual representations would be a more efficient or effective way to communicate the Notional Machine as compared to visualizations, for example. We would like to not resort to metaphors or analogies, but instead use phrasal descriptions of elements and actions over these elements.

Notional Machine #1

  1. Programs can store and manipulate integer values.
  2. Values can be manipulated directly by the program or be stored in program’s memory.
  3. Variables are named aliases for sections of the program’s memory. Variables can hold only one value at a time.
  4. Variable names are unique in the program and should begin with a letter or an underscore, followed by any number of letters, numbers or underscore. Some names cannot be used as variable names because they are reserved to language functionalities.
  5. The assignment statement is the mechanism by which variables can haver their value changed. We use the operator = to tell the program that a variable on the left of the = operator will have its value changed to the value on the right side of the = operator.
  6. A variable is created when we assign a value to it using the correct rules of variable naming an assignment.

This 8 simple rules define our simplest Notional Machine. I again stress that this Notional Machine is not complete nor sound, but instead was designed to communicate the semantics of a very simple Python program, such as a = 2. In this Notional Machine we are not even allowing the execution of multiple statements, different variable types and many other things. It is enough to explain a very limited subset of the language, introducing its most basic elements : literals, variables and assignments.

Next we will augment this Notional Machine to introduce other useful semantic elements.

Notional Machine #2

  1. Programs can store and manipulate integer values.
  2. Values can be manipulated directly by the program or be stored in program’s memory.
  3. Variables are named aliases for sections of the program’s memory. Variables can hold only one value at a time.
  4. Variable names are unique in the program and should begin with a letter or an underscore, followed by any number of letters, numbers or underscore. Some names cannot be used as variable names because they are reserved to language functionalities.
  5. The assignment statement is the mechanism by which variables can haver their value changed. We use the operator = to tell the program that a variable on the left of the = operator will have its value changed to the value on the right side of the = operator.
  6. A variable is created when we assign a value to it using the correct rules of variable naming an assignment.
  7. An assignment is a valid and atomic statement.
  8. A Python program consists of 0 or more lines of code written following a specific set of rules. The rules will tell if each line is valid or not. Each valid line constitutes and statement that will instruct the program to execute an action. Next, the program looks to the next instruction in the next line until no more lines are left to check. Statements are executed from top to bottom.

Now we introduced 2 more rules to accept the of execution of multiple statements, the order of statements execution and what are the valid statements. We simplified this subset of Python to accept only one statement per line. With this Notional Machine, we can evaluate programs such as:

a = 2
b = 3

Notional Machine #3

  1. Programs can store and manipulate integer values.
  2. Values can be manipulated directly by the program or be stored in program’s memory.
  3. Variables are named aliases for sections of the program’s memory. Variables can hold only one value at a time are evaluated to the value they hold.
  4. Variable names are unique in the program and should begin with a letter or an underscore, followed by any number of letters, numbers or underscore. Some names cannot be used as variable names because they are reserved to language functionalities.
  5. The assignment statement is the mechanism by which variables can have their value changed. We use the operator = to tell the program that a variable on the left of the = operator will have its value changed to the value on the right side of the = operator.
  6. A variable is created when we assign a value to it using the correct rules of variable naming an assignment.
  7. An assignment is a valid and atomic statement.
  8. A Python program consists of 0 or more lines of code written following a specific set of rules. The rules will tell if each line is valid or not. Each valid line constitutes and statement that will instruct the program to execute an action. Next, the program looks to the next instruction in the next line until no more lines are left to check. Statements are executed from top to bottom.
  9. An expression is a piece of code that evaluates to a value.

Here we introduce expressions as anything that evaluates to a value. So, a literal is an expression in the same way that variables are also expressions since they are evaluated to a value. Now we can evaluate programs such as:

a = 2
b = a

Notional Machine #4

  1. Programs can store and manipulate integer values.
  2. Values can be manipulated directly by the program or be stored in program’s memory.
  3. Variables are named aliases for sections of the program’s memory. Variables can hold only one value at a time are evaluated to the value they hold.
  4. Variable names are unique in the program and should begin with a letter or an underscore, followed by any number of letters, numbers or underscore. Some names cannot be used as variable names because they are reserved to language functionalities.
  5. The assignment statement is the mechanism by which variables can haver their value changed. We use the operator = to tell the program that a variable on the left of the = operator will have its value changed to the value on the right side of the = operator.
  6. A variable is created when we assign a value to it using the correct rules of variable naming an assignment.
  7. An assignment is a valid and atomic statement.
  8. A Python program consists of 0 or more lines of code written following a specific set of rules. The rules will tell if each line is valid or not. Each valid line constitutes and statement that will instruct the program to execute an action. Next, the program looks to the next instruction in the next line until no more lines are left to check. Statements are executed from top to bottom.
  9. An expression is a piece of code that evaluates to a value.
  10. Arithmetic operators act over a pair of values and are evaluated to a value.

We introduced arithmetic operators and tied them to expressions by stating that the result of an operation is evaluated to a value, therefore also being an expression. With Notional Machine #4 we can evaluate a program such as:

a = 2
a = a + 1

Conclusion

The design of Notional Machines is still in its infancy. But I firmly believe that it is an important concept that should be matured in next years, hopefully with many research projects spawned from this Dagstuhl seminar. I will offer two concrete demands which I encountered recently to illustrate why explicit Notional Machines are important:

  1. My research involves quantifying the cognitive complexity of comprehending code. One key aspect of such quantification involves describing the cognitive actions students need to perform while comprehending elements of the code. A Notional Machine can be used to quantify the number of actions and how these actions interact with each other. Different levels of abstraction and detail could lead to completely different relationships and the number of cognitive actions, therefore changing the complexity of the code.
  2. My institution in Brazil is a peculiar place. We offer an undergraduate CS degree , but we also offer an integrated vocational CS high school degree. Integrated means that students have everything they would have in a regular high school course plus a CS course (they have more or less the same courses they would have at undergraduate level, but with a smaller scope in a 3 years degree) . My institution hire lecturers using a process similar to any other university: we value previous research conducted by the candidate, having a doctoral degree, previous projects and so on. Most of us taught for years in undergraduate courses before joining my institution. What is the problem here? What happens when a newly hired lecturer, coming from an undergraduate context, suddenly faces the challenge of teaching CS to teenagers? You can guess. How do we communicate to other teachers how to present the semantics of the language in a way that is suitable to this particular audience? I faced this question last month as we changed the base programming language of the vocational degree to JavaScript. I still don’t have many answers to the problems that such change impose, since the Notional Machine of JavaScript possibly needs to explain things(networks, web development tools, frameworks) that the previous Java Notional Machine did not need.

While the Notional Machine we developed at Dagstuhl is very simple and only cover a small subset of the Python language, the process behind its design it’s quite insightful and hopefully useful for instructors and researchers designing their Notional Machine. Needless to say, there is no single Notional Machine for Python, but many incarnations of a Notional Machine. Given this multitude of possibilities, that’s why I believe it is crucial to discuss the intent of a Notional Machine and not only the final product.

Is this kind of Notional Machine useful? Maybe. It is still an open question how much such Notional Machine can impact how learners develop their mental models and if it is possible to transform incorrect mental models by exposing students to a concrete Notional Machine. It would be interesting to check how much of students’ mental models match the semantics described by a Notional Machine. If we observe a strong correlation between the presented concrete Notional Machine and students’ mental models, will be interesting to investigate Notional Machines that could lead to fewer misconceptions.

Design a site like this with WordPress.com
Get started