Unveiling the Mystery of the JavaScript AST: How to Use the Babel AST Four-Piece Toolkit

Posted by 还是社会实践 on 2023-4-12 20:14:21

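Author: 周明亮, JD Retail
Preface

The previous article briefly introduced some basic concepts and applications:

[*]Parsers
[*]Abstract syntax trees (AST)
[*]What ASTs are used for in JS
[*]AST applications in practice
With that initial understanding and some routine code-transformation practice behind us, let's now look in detail at how to transform code with an AST.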
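How to Use the Babel AST Four-Piece Toolkit

As mentioned earlier, many tools can parse code into an AST. JS AST tools largely agree on a common set of node names. They differ mainly in how much detail they expose, and some add extra methods or properties of their own.
So pick whichever tool you prefer. Here we go with our old friend babel.
Getting to Know Babel

In this era of ever-multiplying front-end frameworks, most developers have heard of babel. If you have not, its documentation is a good place to dig deeper.
After all, its shadow can be seen in almost every mainstream framework.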

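[*]Babel JS official website
[*]Babel JS GitHub
As one of the most widely used JS compilers, it converts code written in ECMAScript 2015+ syntax into backward-compatible JavaScript that can run in current and older browsers or other environments.
This downward compatibility and code conversion is built on parsing code and then transforming it. Next, let's look at how to use the four core packages that @babel/core builds on: @babel/parser, @babel/traverse, @babel/types, and @babel/generator.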
1. @babel/parser

@babel/parser is the core code parser. It performs lexical analysis and then syntax analysis, and finally produces the AST form described earlier.
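To make "lexical analysis" concrete, here is a minimal sketch of a toy tokenizer for a tiny subset of JavaScript. It is an illustration only: the `tokenize` function, its rule table, and the token type names are hypothetical and are not how @babel/parser is implemented.

```javascript
// Toy lexer: splits a tiny subset of JavaScript into tokens.
// Illustration only; the rule names and token shapes are made up for this sketch.
function tokenize(code) {
  const rules = [
    ["whitespace", /^\s+/],
    ["string", /^"[^"]*"/],
    ["name", /^[A-Za-z_$][\w$]*/],
    ["punctuator", /^[=(){}.;]/],
  ];
  const keywords = new Set(["const", "function"]);
  const tokens = [];
  let rest = code;
  while (rest.length > 0) {
    // Pick the first rule whose pattern matches at the start of the remaining text.
    const rule = rules.find(([, pattern]) => pattern.test(rest));
    if (!rule) throw new SyntaxError(`Unexpected character: ${rest[0]}`);
    const [type, pattern] = rule;
    const text = pattern.exec(rest)[0];
    rest = rest.slice(text.length);
    if (type === "whitespace") continue; // separates tokens but is not a token itself
    const isKeyword = type === "name" && keywords.has(text);
    tokens.push({ type: isKeyword ? "keyword" : type, value: text });
  }
  return tokens;
}

console.log(tokenize('const me = "我"'));
```

Syntax analysis is the next step: it consumes a flat token list like this one and groups the tokens into nested nodes such as VariableDeclaration and StringLiteral.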
Suppose we need to read the code in a React project's index.tsx file. We can use the following code:
const fs = require("fs");
const { parse } = require("@babel/parser");

// Read the file; passing an encoding makes readFileSync return a string
const fileCode = fs.readFileSync("./code/app/index.tsx", "utf8");
// Parse the source into an AST object
const codeAST = parse(fileCode, {
  // parse in strict mode and allow module declarations
  sourceType: "module",
  plugins: [
    // enable jsx and typescript syntax
    "jsx",
    "typescript",
  ],
});

Of course, we are not limited to React code. Vue syntax can be read as well, and it has its own parser, for example @vue/compiler-dom.
In addition, by passing different options we can parse all kinds of code. If we are only reading an ordinary .js file, no plugins are needed.
const codeAST = parse(fileCode, {
  // parse in strict mode and allow module declarations
  sourceType: "module",
});

The conversion above gives us a standard AST object. The previous article analyzed it in detail, so we will not expand on it here. For example:
// Original code
const me = "我"
function write() {
  console.log("文章")
}

// The AST object after conversion
const codeAST = {
"type": "File",
"errors": [],
"program": {
"type": "Program",
"sourceType": "module",
"interpreter": null,
"body": [
   {
   "type": "VariableDeclaration",
   "declarations": [
      {
         "type": "VariableDeclarator",
         "id": {
         "type": "Identifier",
         "name": "me"
         },
         "init": {
         "type": "StringLiteral",
         "extra": {
            "rawValue": "我",
            "raw": "\"我\""
         },
         "value": "我"
         }
      }
   ],
   "kind": "const"
   },
   {
   "type": "FunctionDeclaration",
   "id": {
      "type": "Identifier",
      "name": "write"
   },
   "generator": false,
   "async": false,
   "params": [],
   "body": {
      "type": "BlockStatement",
      "body": [
         {
         "type": "ExpressionStatement",
         "expression": {
            "type": "CallExpression",
            "callee": {
               "type": "MemberExpression",
               "object": {
                  "type": "Identifier",
                  "name": "console"
               },
               "computed": false,
               "property": {
                  "type": "Identifier",
                  "name": "log"
               }
            },
            "arguments": [
               {
                  "type": "StringLiteral",
                  "extra": {
                     "rawValue": "文章",
                     "raw": "\"文章\""
                  },
                  "value": "文章"
               }
            ]
         }
         }
      ]
   }
   }
]
}
}

2. @babel/traverse

Once we have a standard AST object, operating on it means walking the tree structure. This is where @babel/traverse comes in.
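As a rough picture of what "walking the tree" means, the sketch below does a plain depth-first walk over a hand-written, trimmed copy of the AST for `const me = "我"`. The `walk` and `isNode` helpers are hypothetical and dependency-free. @babel/traverse does considerably more: it follows per-type visitor keys, tracks scope, and, as its own output in this article shows, begins visiting at Program rather than File.

```javascript
// Hand-written, trimmed copy of the AST for: const me = "我"
const ast = {
  type: "File",
  program: {
    type: "Program",
    body: [
      {
        type: "VariableDeclaration",
        kind: "const",
        declarations: [
          {
            type: "VariableDeclarator",
            id: { type: "Identifier", name: "me" },
            init: { type: "StringLiteral", value: "我" },
          },
        ],
      },
    ],
  },
};

// Assumption for this sketch: a value is an AST node if it is an object with a string `type`.
const isNode = (value) =>
  value !== null && typeof value === "object" && typeof value.type === "string";

// Depth-first walk: call `visit` on a node, then recurse into every child node.
function walk(node, visit) {
  visit(node);
  for (const value of Object.values(node)) {
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (isNode(child)) walk(child, visit);
    }
  }
}

const visited = [];
walk(ast, (node) => visited.push(node.type));
console.log(visited);
```

The printed array lists the node types from the root downwards, parent before child, which is the same top-down order the traversal output in this section follows.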
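For example, once we have the AST we can traverse it:
const { default: traverse } = require('@babel/traverse');

// Called when entering a node
const onEnter = pt => {
  // Work to do on entering the current node
  console.log(pt)
}
// Called when exiting a node
const onExit = pe => {
  // Work to do on exiting the current node
}
traverse(codeAST, { enter: onEnter, exit: onExit })

So what does pt look like when it is printed for the first node we visit?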
// Some irrelevant values omitted
<ref *1> NodePath {
  contexts: [
    TraversalContext {
      queue: [Array],
      priorityQueue: [],
      ...
    }
  ],
  state: undefined,
  opts: {
    enter: [ [Function: onEnter] ],
    exit: [ [Function: onExit] ],
    _exploded: true,
    _verified: true
  },
  _traverseFlags: 0,
  skipKeys: null,
  parentPath: null,
  container: Node {
    type: 'File',
    errors: [],
    program: Node {
      type: 'Program',
      sourceType: 'module',
      interpreter: null,
      body: [Array],
      directives: []
    },
    comments: []
  },
  listKey: undefined,
  key: 'program',
  node: Node {
    type: 'Program',
    sourceType: 'module',
    interpreter: null,
    body: [ [Node], [Node] ],
    directives: []
  },
  type: 'Program',
  parent: Node {
    type: 'File',
    errors: [],
    program: Node {
      type: 'Program',
      sourceType: 'module',
      interpreter: null,
      body: [Array],
      directives: []
    },
    comments: []
  },
  hub: undefined,
  data: null,
  context: TraversalContext {
    queue: [ [Circular *1] ],
    priorityQueue: [],
    ...
  },
  scope: Scope {
    uid: 0,
    path: [Circular *1],
    block: Node {
      type: 'Program',
      sourceType: 'module',
      interpreter: null,
      body: [Array],
      directives: []
    },
    ...
  }
}

That is a lot of output for a single visit. It is too long, so let's trim it down and look only at the key parts:
// 1st visit
<ref *1> NodePath {
  listKey: undefined,
  key: 'program',
  node: Node {
    type: 'Program',
    sourceType: 'module',
    interpreter: null,
    body: [ [Node], [Node] ],
    directives: []
  },
  type: 'Program',
}

We can see that traversal goes straight into the program node. The corresponding AST node information:
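program: {
  type: 'Program',
  sourceType: 'module',
  interpreter: null,
  body: [
    // VariableDeclaration node (const me = "我")
    // FunctionDeclaration node (function write() { ... })
  ],
},

Next, we keep printing the visited nodes, and we can see that traversal moves on to the nodes in program.body.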
// 2nd visit
<ref *2> NodePath {
  listKey: 'body',
  key: 0,
  node: Node {
    type: 'VariableDeclaration',
    declarations: [ [Node] ],
    kind: 'const'
  },
  type: 'VariableDeclaration',
}

// 3rd visit
<ref *1> NodePath {
  listKey: 'declarations',
  key: 0,
  node: Node {
    type: 'VariableDeclarator',
    id: Node {
      type: 'Identifier',
      name: 'me'
    },
    init: Node {
      type: 'StringLiteral',
      extra: [Object],
      value: '我'
    }
  },
  type: 'VariableDeclarator',
}

// 4th visit
<ref *1> NodePath {
  listKey: undefined,
  key: 'id',
  node: Node {
    type: 'Identifier',
    name: 'me'
  },
  type: 'Identifier',
}

// 5th visit
<ref *1> NodePath {
  listKey: undefined,
  key: 'init',
  node: Node {
    type: 'StringLiteral',
    extra: { rawValue: '我', raw: "'我'" },
    value: '我'
  },
  type: 'StringLiteral',
}

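[*]node: the current node
[*]parentPath: the path of the parent node
[*]scope: the scope
[*]parent: the parent node
[*]type: the type of the current node
Now the pattern of the visits is clear: traversal keeps following the node property of the current path and descends level by level through its contents until every node in the AST has been visited.
Be sure to distinguish the two types NodePath and Node. In the example above, pt is a NodePath, while pt.node is a Node.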
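The difference between a path and a node can be sketched with a toy wrapper. The field names below mirror the ones listed above (node, parentPath, parent, type, plus key and listKey from the printed output), but `makePath` and `childPaths` are hypothetical helpers, not Babel's NodePath class. One deliberate simplification: this sketch also wraps the File node in a path, whereas the Babel output above shows parentPath: null for Program.

```javascript
// Hand-written, trimmed copy of the AST for: const me = "我"
const ast = {
  type: "File",
  program: {
    type: "Program",
    body: [
      {
        type: "VariableDeclaration",
        kind: "const",
        declarations: [],
      },
    ],
  },
};

// Assumption for this sketch: a node is any object with a string `type`.
const isNode = (value) =>
  value !== null && typeof value === "object" && typeof value.type === "string";

// A toy "path": the node itself plus a record of where it sits in the tree.
function makePath(node, parentPath, key, listKey) {
  return {
    node,
    type: node.type,
    parentPath,
    parent: parentPath ? parentPath.node : null,
    key, // property name, or the index when the node lives in an array
    listKey, // name of the containing array property, otherwise undefined
  };
}

// Build toy paths for every direct child node of the given path.
function childPaths(path) {
  const paths = [];
  for (const [key, value] of Object.entries(path.node)) {
    if (Array.isArray(value)) {
      value.forEach((child, index) => {
        if (isNode(child)) paths.push(makePath(child, path, index, key));
      });
    } else if (isNode(value)) {
      paths.push(makePath(value, path, key, undefined));
    }
  }
  return paths;
}

const filePath = makePath(ast, null, undefined, undefined);
const [programPath] = childPaths(filePath);
const [declarationPath] = childPaths(programPath);

console.log(programPath.key, programPath.type); // program Program
console.log(declarationPath.listKey, declarationPath.key); // body 0
console.log(declarationPath.node.kind); // const
```

The node holds only syntax data such as kind, while the path adds the position information (key, listKey, parent) that makes it possible to move around the tree or replace a node in place.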
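Next, notice that besides the enter method there is also an exit method. This means every node that is visited is also exited, so we get two chances to look at each node's information.
If traversal finishes without finding the node we are looking for, we can take an extra step and add the missing code node. The complete node visiting flow is as follows: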
Enter > Program
    Enter > node.body
        Enter > node.declarations
            Enter > node.id
            Exit < node.id
            Enter > node.init
            Exit < node.init
        Exit < node.declarations
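The enter and exit ordering in this flow can be reproduced with a toy traversal. This is a minimal sketch under stated assumptions: `traverse` here is a hypothetical stand-in written for this example, it hands the visitor a plain node rather than a NodePath, and it discovers children by property order instead of Babel's per-type visitor keys.

```javascript
// Hand-written, trimmed copy of the Program node for: const me = "我"
const program = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "me" },
          init: { type: "StringLiteral", value: "我" },
        },
      ],
    },
  ],
};

// Assumption for this sketch: a node is any object with a string `type`.
const isNode = (value) =>
  value !== null && typeof value === "object" && typeof value.type === "string";

// Toy traversal: call enter, visit all children depth-first, then call exit.
function traverse(node, visitor) {
  if (visitor.enter) visitor.enter(node);
  for (const value of Object.values(node)) {
    for (const child of Array.isArray(value) ? value : [value]) {
      if (isNode(child)) traverse(child, visitor);
    }
  }
  if (visitor.exit) visitor.exit(node);
}

const log = [];
traverse(program, {
  enter: (node) => log.push(`enter ${node.type}`),
  exit: (node) => log.push(`exit ${node.type}`),
});
console.log(log.join("\n"));
```

Every enter line has a matching exit line, and a parent's exit is logged only after all of its children have been entered and exited. That is why an exit handler on Program is a natural place for the "add the node if it was never found" step mentioned above.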
