The previous post explained Gatsby's initialization up to loading plugins. Next is the onPreInit step.
( 3 / 7 ) onPreInit #
As the official documentation says, the onPreInit API runs as soon as plugins are loaded. The function apiRunnerNode handles Gatsby Node APIs like onPreInit.
step | summary | core function |
---|---|---|
1 | call onPreInit API implemented in gatsby-node.js | apiRunnerNode |
When apiRunnerNode receives the string onPreInit as an argument, it finds the onPreInit API in gatsby-node.js of the loaded plugins and calls it.
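To make the idea concrete, here is a minimal sketch of "find the API by name in each plugin and call it". This is not Gatsby's actual apiRunnerNode: the function name runNodeApi, the plugin objects and their module field are hypothetical names I made up for illustration, and the real function also deals with arguments, async APIs and error reporting.

```javascript
// Hypothetical, simplified stand-in for apiRunnerNode (not Gatsby's code).
// Each plugin is modeled as { name, module }, where `module` plays the role
// of the exports of that plugin's gatsby-node.js.
function runNodeApi(apiName, plugins, args = {}) {
  const results = [];
  for (const plugin of plugins) {
    const impl = plugin.module[apiName];
    // Skip plugins that do not implement this API.
    if (typeof impl !== "function") continue;
    results.push({ plugin: plugin.name, result: impl(args) });
  }
  return results;
}

// Made-up plugins for the demo.
const plugins = [
  { name: "plugin-a", module: { onPreInit: () => "a ready" } },
  { name: "plugin-b", module: { onPreBootstrap: () => "b bootstrapped" } },
  { name: "default-site-plugin", module: { onPreInit: () => "site ready" } },
];

const called = runNodeApi("onPreInit", plugins);
console.log(called.map((c) => c.plugin)); // only plugins implementing onPreInit
```

The point of the sketch is that the runner itself knows nothing about what onPreInit does. It only looks up a name and calls whatever the plugins exported under it.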
Judgement of whether to read more in detail or skip it #
In this step, I can't know concretely what the onPreInit API does, because users or plugins implement it in gatsby-node.js. I thought that reading more of the apiRunnerNode code would give me no more information about the onPreInit step. It was enough at the time to understand that apiRunnerNode handles Gatsby Node APIs.
When reading code, it is sometimes important to ask whether the code is closely related to the topic we want to know about. If it is not, we had better skip reading it in detail. Otherwise we often get lost in information that we don't need immediately.
( 4 / 7 ) delete html and css files from previous builds #
Gatsby deletes the html and css files left over from previous builds. There is no core function. This step is skipped when the function initialize is called from the command gatsby develop.
This step and the next are what I imagined "initialization" to be before reading the code.
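The deletion step could be sketched as below, using only Node.js built-ins. This is my own illustration, not the code Gatsby runs: the function name deleteHtmlAndCss and the demo directory are assumptions, and I have not covered which directory Gatsby actually cleans or which paths it excludes.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Recursively delete .html and .css files under `dir`; returns deleted paths.
function deleteHtmlAndCss(dir) {
  const deleted = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      deleted.push(...deleteHtmlAndCss(fullPath));
    } else if (/\.(html|css)$/.test(entry.name)) {
      fs.unlinkSync(fullPath);
      deleted.push(fullPath);
    }
  }
  return deleted;
}

// Demo in a throwaway directory standing in for a previous build's output.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "previous-build-"));
fs.mkdirSync(path.join(demoDir, "blog"));
fs.writeFileSync(path.join(demoDir, "index.html"), "<html></html>");
fs.writeFileSync(path.join(demoDir, "blog", "styles.css"), "body{}");
fs.writeFileSync(path.join(demoDir, "app.js"), "console.log(1)");

const removed = deleteHtmlAndCss(demoDir);
console.log(removed.length); // number of deleted html/css files
console.log(fs.existsSync(path.join(demoDir, "app.js"))); // other files stay
```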
( 5 / 7 ) initialize cache #
After deleting the html and css files, the function initialize checks and deletes the cache, and then creates empty directories. There is no core function. Instead, this step can be divided into the 4 processes below.
process | summary | core function |
---|---|---|
1 | create a hash to check updated plugins | (createHash) |
2 | check a new hash against the hash stored in Redux | - |
3 | delete .cache directory and files in it | - |
4 | create an empty .cache directory and public/static directory | (ensureDir) |
The function createHash is provided by the Node.js module crypto.
It is impressive how Gatsby checks for updated plugins before deleting the cache. The first process creates a hash of the version numbers of all installed plugins and of the site's package.json, gatsby-config.js and gatsby-node.js. The second process checks the new hash against the old hash stored in Redux.
In other words, Gatsby deletes the cache during initialization if
- any plugins are updated.
- any of package.json, gatsby-config.js and gatsby-node.js is modified.
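The hash check described above can be sketched with crypto.createHash. This is a simplified illustration under my own assumptions: the function names, the shape of the input object and the choice of sha256 are mine, not taken from Gatsby's source, and the old hash is a plain variable here rather than a value stored in Redux.

```javascript
const crypto = require("node:crypto");

// Build one hash from plugin versions and the contents of the site's files.
// The shape of `input` is a made-up simplification for this sketch.
function createPluginsHash(input) {
  const hash = crypto.createHash("sha256"); // algorithm is an assumption
  hash.update(JSON.stringify(input.pluginVersions));
  for (const content of input.fileContents) {
    hash.update(content);
  }
  return hash.digest("hex");
}

// The cache is considered stale when the stored hash differs from the new one.
function shouldDeleteCache(oldHash, newHash) {
  return oldHash !== newHash;
}

// fileContents stands for package.json, gatsby-config.js and gatsby-node.js.
const before = createPluginsHash({
  pluginVersions: ["gatsby-plugin-a@1.0.0", "gatsby-plugin-b@2.1.0"],
  fileContents: ['{"name":"my-site"}', "module.exports = {}", ""],
});
const after = createPluginsHash({
  pluginVersions: ["gatsby-plugin-a@1.0.1", "gatsby-plugin-b@2.1.0"],
  fileContents: ['{"name":"my-site"}', "module.exports = {}", ""],
});

console.log(shouldDeleteCache(before, before)); // false: nothing changed
console.log(shouldDeleteCache(before, after)); // true: a plugin was updated
```

Comparing one hash is cheaper than comparing every version number and file one by one, which is probably why a single combined hash is used.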
( 6 / 7 ) copy gatsby files #
This step has 3 processes. Many files and directories in the package gatsby are copied into the .cache directory of a user's Gatsby project. The rest of the processes prepare for loading the Gatsby SSR API and Gatsby Browser API.
process | summary | core function |
---|---|---|
1 | copy files in the package gatsby into .cache directory of a user's Gatsby project | - |
2 | create subdirectories of .cache | - |
3 | create files to load gatsby-ssr.js and gatsby-browser.js in each plugin | - |
The copied files are located in the /cache-dir/ directory. There are 30+ files and 4 directories. For example, default-html.js is shown below.
import React from "react"
import PropTypes from "prop-types"

export default function HTML(props) {
  return (
    <html {...props.htmlAttributes}>
      <head>
        <meta charSet="utf-8" />
        <meta httpEquiv="x-ua-compatible" content="ie=edge" />
        <meta
          name="viewport"
          content="width=device-width, initial-scale=1, shrink-to-fit=no"
        />
        {props.headComponents}
      </head>
      <body {...props.bodyAttributes}>
        {props.preBodyComponents}
        <div
          key={`body`}
          id="___gatsby"
          dangerouslySetInnerHTML={{ __html: props.body }}
        />
        {props.postBodyComponents}
      </body>
    </html>
  )
}

HTML.propTypes = {
  htmlAttributes: PropTypes.object,
  headComponents: PropTypes.array,
  bodyAttributes: PropTypes.object,
  preBodyComponents: PropTypes.array,
  body: PropTypes.string,
  postBodyComponents: PropTypes.array,
}
After the copy, Gatsby detects the plugins that have gatsby-ssr.js and/or gatsby-browser.js. Each of these plugins creates api-runner-ssr.js and/or api-runner-browser-plugins.js to load the Gatsby SSR API and Gatsby Browser API. Then all the paths to these files are written into api-runner-ssr.js and api-runner-browser-plugins.js of the user's Gatsby project.
This is more complicated than handling the Gatsby Node API. I don't understand the reason yet, but I expect to find it as I continue reading. My guess for now is that this complicated process for the Gatsby SSR API and Browser API is due to the phases in which they are called, such as after the build or in the browser.
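The idea of writing plugin paths into a generated file can be sketched as follows. This is only my guess at the general technique, not the code Gatsby uses: the function name buildBrowserPluginsModule, the hasBrowserApi flag and the exact shape of the generated module are all hypothetical.

```javascript
// Hypothetical sketch: turn a list of plugins into the source text of a
// module that requires each plugin's gatsby-browser.js.
function buildBrowserPluginsModule(plugins) {
  const entries = plugins
    // Keep only plugins that ship a gatsby-browser.js (flag is made up).
    .filter((plugin) => plugin.hasBrowserApi)
    .map((plugin) => {
      const file = JSON.stringify(plugin.resolve + "/gatsby-browser.js");
      return `  { plugin: require(${file}) }`;
    });
  return `module.exports = [\n${entries.join(",\n")}\n]\n`;
}

// Made-up plugin locations for the demo.
const source = buildBrowserPluginsModule([
  { resolve: "/site/node_modules/plugin-a", hasBrowserApi: true },
  { resolve: "/site/node_modules/plugin-b", hasBrowserApi: false },
  { resolve: "/site", hasBrowserApi: true },
]);
console.log(source);
```

Generating source text like this lets a bundler see plain require calls with fixed paths, which may be one reason the SSR and Browser APIs are wired up through generated files instead of being looked up at run time like the Node APIs.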
( 7 / 7 ) onPreBootstrap #
As in the onPreInit step (3/7), apiRunnerNode searches for the onPreBootstrap API in plugins and calls it.
return value of the function initialize #
The return value is simple: store and workerPool, as shown below. The Redux store holds the whole state tree during the build. workerPool is a jest-worker object.
const workerPool = WorkerPool.create()

return {
  store,
  workerPool,
}
Summary of the function initialize #
At the beginning of my Gatsby code reading project, I read the function initialize in detail. initialize is called by the function bootstrap. When building static sites, Gatsby calls initialize in the sequence below.
gatsby build
-> /src/commands/build.ts : build
-> /src/bootstrap/index.ts : bootstrap
-> /src/services/initialize.ts : initialize
The function initialize does the following:
- open and validate gatsby-config.js files
- load plugins
- delete html and css files
- delete cache
- copy gatsby files
- call Gatsby Node API (onPreInit and onPreBootstrap)
These processes account for one third of the bootstrap process during build.
Summary of code reading #
In the end, I read more than 2,800 lines of code to understand the function initialize, though initialize itself has just 540 lines.
Through this experience I ran into many challenges in code reading, and I can now update my tips for code reading.
Knowledge often helps us understand code #
Knowledge and background helped me understand the code well. The point of this tip is how early we can notice that we lack the knowledge behind the code. I recommend searching the official documentation as soon as you don't understand some code. You can find a detailed explanation or a release note on the official site.
Panes, not tabs, let us read code spread over multiple files better #
Our memory can retain only a limited number of functions, variables and types at the same time. To read code that spans multiple files, a tool such as a text editor should support our memory by showing multiple parts of the code on one screen. In my opinion, panes can do this and tabs cannot.
Focus on whether the code is closely related to the topic we want to know #
It is sometimes important to ask whether the code is closely related to the topic we want to know about. If it is not, we had better skip reading it in detail. Otherwise we often get lost in information that we don't need immediately.
Making lists of functions prevents me from getting confused by too much information of code. #
Making lists of functions is one of my tips for code reading. In the course of code reading, I had to organize a lot of information in order to add explanations to the lists of functions. The list below shows the core functions that loadPlugins includes.
process | summary | core function |
---|---|---|
1 | create an array including all the plugins | normalizeConfig |
2 | validate options of all the plugins | validateConfigPluginsOptions |
3 | load browser API, node API, SSR API from files | getAPI |
4 | load plugins in gatsby-config, internal plugins and other plugins | loadPluginsInternal |
5 | flatten plugins with nests into an array | flattenPlugins |
6 | identify which APIs each plugin exports | collatePluginAPIs |
7 | distinguish types of bad exports API and report errors | handleBadExports |
8 | detect multiple replaceRenderers and report warnings | handleMultipleReplaceRenderers |
As a result, I can understand the sequence of function calls and the data that the functions create, transform or just delete. This routine of making lists of functions lightens my burden when understanding code, and I can keep a clear head without getting confused by too much information in the code.
Scanning the whole code of a function or a file helps me divide a difficulty into many parts. #
I tried to read the function initialize line by line from the beginning. But I soon got confused and overwhelmed, because I couldn't see where the function ended and therefore didn't understand what each part of the code meant. As a countermeasure, I scanned the whole code of initialize several times. This approach finds the parts I understand easily rather than the parts I don't. The more parts of the code I understand, the better I can focus on the others.
Additionally, scanning the whole code can reveal patterns in it. In the case of initialize, I found that activityTimer was called repeatedly. These calls divided the code into 7 parts. It was a great discovery for me. I could read the code of initialize part by part and understand it more easily than before dividing it.
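Scanning for a repeated call can also be done mechanically. The sketch below lists the lines on which a given function is called in a piece of source text. The helper findCallSites is my own hypothetical name, and the sample source is a made-up excerpt with placeholder labels, not the real contents of initialize.

```javascript
// List the 1-based line numbers where a given call appears in source text.
function findCallSites(sourceText, functionName) {
  const sites = [];
  sourceText.split("\n").forEach((line, index) => {
    if (line.includes(`${functionName}(`)) {
      sites.push({ line: index + 1, text: line.trim() });
    }
  });
  return sites;
}

// Made-up excerpt; the labels are placeholders, not Gatsby's real ones.
const sample = [
  "let activity = reporter.activityTimer(`step one`)",
  "activity.start()",
  "doStepOne()",
  "activity.end()",
  "activity = reporter.activityTimer(`step two`)",
  "activity.start()",
].join("\n");

const sites = findCallSites(sample, "activityTimer");
console.log(sites.map((site) => site.line)); // [ 1, 5 ]
```

Each reported line marks the start of one part, so the output works as a rough table of contents for a long function.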
What's Next #
The next phase is sourcing data, type inference and building the GraphQL schema. My code reading project is approaching the core of Gatsby, which is a React-based static site generator with GraphQL. Let's go!