There is a term in computer science that I have always wanted to understand in depth: the AST (Abstract Syntax Tree). Without understanding the AST, it is hard to understand the architecture of a compiler. So let’s explore it.
Abstract Syntax Tree
An AST (Abstract Syntax Tree), or simply syntax tree, is a tree representation of source code written in a programming language. It gives the compiler a structured view of the code, which makes later stages such as machine code generation easier.
When we compile code, the compiler typically hands the original source to a parser. The parser first splits the source into small units called tokens (this step is often done by a separate component called a lexer or tokenizer).
let a = 10
In this example, the source is split into let, a, = and 10. We can classify them as:
let => Keyword/Kind
a => Identifier
= => Assignment
10 => Literal
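The splitting step above can be sketched in a few lines of JavaScript. This is only a minimal illustration, not how Babel tokenizes: the tokenize function, the KEYWORDS set and the token type names are assumptions made up for this example, and anything other than identifiers, integers and = is silently ignored.

```javascript
// Minimal tokenizer sketch for statements like `let a = 10`.
// `tokenize`, `KEYWORDS` and the token type names are hypothetical,
// chosen to mirror the classification in this post.
const KEYWORDS = new Set(["let", "const", "var"]);

function tokenize(source) {
  const tokens = [];
  // Matches identifiers/keywords, integers, or `=`; everything else
  // (including whitespace) is skipped.
  for (const [text] of source.matchAll(/[A-Za-z_$][\w$]*|\d+|=/g)) {
    if (KEYWORDS.has(text)) {
      tokens.push({ type: "Keyword", value: text });
    } else if (/^\d+$/.test(text)) {
      tokens.push({ type: "Literal", value: Number(text) });
    } else if (text === "=") {
      tokens.push({ type: "Assignment", value: text });
    } else {
      tokens.push({ type: "Identifier", value: text });
    }
  }
  return tokens;
}

console.log(tokenize("let a = 10"));
```

A real tokenizer would also track source positions and report unexpected characters instead of skipping them.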
Now that we have the tokens, we can build an AST. Let’s walk through the Abstract Syntax Tree for let a = 10.
Each block here is called a node. The tree starts at Program, which marks the start of our program, followed by Body, which contains every top-level node.
VariableDeclaration holds everything about the variable a (its kind and its assigned value). It splits into two parts, Kind and Declarations. Kind holds the declaration keyword (let in our case). Declarations contains a VariableDeclarator, which splits into two nodes, ID and Init. ID contains the Identifier, which is a in our case, and Init contains the Literal, which is 10.
// JSON format of our AST
{
  "type": "Program",
  "body": [
    {
      "type": "VariableDeclaration",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "id": {
            "type": "Identifier",
            "name": "a"
          },
          "init": {
            "type": "Literal",
            "value": 10
          }
        }
      ],
      "kind": "let"
    }
  ]
}
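To make the step from tokens to tree concrete, here is a small parser sketch that builds the JSON structure above from four tokens. It is an illustration under stated assumptions: parseVariableDeclaration and the token shape are hypothetical, and it only understands the exact form keyword, identifier, =, literal.

```javascript
// Parser sketch: turns the four tokens of `let a = 10` into the AST above.
// `parseVariableDeclaration` and the token objects are hypothetical.
function parseVariableDeclaration(tokens) {
  const [kind, id, assign, init] = tokens;
  if (
    tokens.length !== 4 ||
    kind.type !== "Keyword" ||
    id.type !== "Identifier" ||
    assign.type !== "Assignment" ||
    init.type !== "Literal"
  ) {
    throw new SyntaxError("expected: <keyword> <identifier> = <literal>");
  }
  return {
    type: "Program",
    body: [
      {
        type: "VariableDeclaration",
        declarations: [
          {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: id.value },
            init: { type: "Literal", value: init.value },
          },
        ],
        kind: kind.value,
      },
    ],
  };
}

const tokens = [
  { type: "Keyword", value: "let" },
  { type: "Identifier", value: "a" },
  { type: "Assignment", value: "=" },
  { type: "Literal", value: 10 },
];
const ast = parseVariableDeclaration(tokens);
console.log(JSON.stringify(ast, null, 2));
```

Real parsers handle many statement forms and recurse into nested expressions; this sketch hard-codes a single shape to keep the mapping from tokens to nodes visible.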
The parser also keeps track of scope: if your code has nested scopes, it can relate each child scope to its parent scope for a given node. I will cover this in another blog post.
That covers the basics of the AST. Now let’s write some code and see an example.
Example
We will remove every console.log statement from our code by manipulating the AST. We will do this with Babel.
// add this to scripts in package.json
"build": "babel app.js -d lib --plugins ./plugin.js"
// app.js
let a = 10
console.log(a)
We have a variable a and a console statement. We don’t want to ship the console statement to production, so we are going to remove it by manipulating the AST.
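// plugin.js
// Removes console.* call statements (e.g. console.log) from the code
module.exports = function () {
  return {
    visitor: {
      ExpressionStatement(path) {
        const expr = path.node.expression;
        // Only remove statements of the form console.<method>(...)
        if (
          expr.type === "CallExpression" &&
          expr.callee.type === "MemberExpression" &&
          expr.callee.object.type === "Identifier" &&
          expr.callee.object.name === "console"
        ) {
          path.remove();
        }
      }
    }
  };
};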
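In the code above, we export a function that returns an object. This object has a visitor key. Babel traverses the AST and, whenever it reaches a node whose type matches a method on the visitor (ExpressionStatement in our code), it calls that method so we can manipulate the node. Each visitor method receives a path as its argument. The path wraps the matched node (available as path.node) together with its position in the tree. Calling path.remove() removes that entire statement from the tree. Keep in mind that ExpressionStatement matches every expression statement, not only console calls, so a plugin should check that the statement really is a console call before removing it.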
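To see what the traversal is doing, it can help to imitate it by hand without Babel. The sketch below is a simplified stand-in, not Babel’s implementation: the traverse function and its tiny path object (with only node, parent and remove) are hypothetical, and the AST for app.js is written out manually in the same node style as the JSON earlier in this post.

```javascript
// Hand-rolled stand-in for a visitor traversal. `traverse` and this
// minimal `path` object are hypothetical simplifications.
function traverse(node, visitor, parent = null, key = null, index = null) {
  if (!node || typeof node.type !== "string") return;
  const visit = visitor[node.type];
  if (visit) {
    let removed = false;
    const path = {
      node,
      parent,
      // Detach this node from its parent (not supported on the root node).
      remove() {
        if (index === null) parent[key] = null;
        else parent[key].splice(index, 1);
        removed = true;
      },
    };
    visit(path);
    if (removed) return; // do not descend into a removed node
  }
  for (const [childKey, child] of Object.entries(node)) {
    if (Array.isArray(child)) {
      // Walk backwards so removing an element never skips a sibling.
      for (let i = child.length - 1; i >= 0; i--) {
        traverse(child[i], visitor, node, childKey, i);
      }
    } else if (child && typeof child === "object") {
      traverse(child, visitor, node, childKey, null);
    }
  }
}

// Manually written AST for: let a = 10; console.log(a)
const programAst = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "let",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "a" },
          init: { type: "Literal", value: 10 },
        },
      ],
    },
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: {
          type: "MemberExpression",
          object: { type: "Identifier", name: "console" },
          property: { type: "Identifier", name: "log" },
        },
        arguments: [{ type: "Identifier", name: "a" }],
      },
    },
  ],
};

const removeConsole = {
  ExpressionStatement(path) {
    const expr = path.node.expression;
    if (
      expr.type === "CallExpression" &&
      expr.callee.type === "MemberExpression" &&
      expr.callee.object.name === "console"
    ) {
      path.remove();
    }
  },
};

traverse(programAst, removeConsole);
console.log(programAst.body.map((n) => n.type));
```

After the traversal, only the VariableDeclaration is left in the program body. Babel’s real path objects do much more (scope information, replacement, insertion), so treat this only as a mental model.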
Run npm run build and look at the build file generated in the lib directory. There is no console statement in the build file.
For visual representation of AST, JointJS
For manipulating AST, ASTExplorer