Executing generation parsers

To generate code from your SDTF client with the SDK, you can leverage Specify's built-in parsers and/or add your own custom implementation.

The SDK provides two main methods: one favoring flexibility, the other increased performance.

By default, generation runs locally, using the host machine's resources. To work around environment limitations, any of Specify's built-in parsers can be executed remotely, on Specify's servers.

Create parsers pipelines

With your SDTF client, you can create as many parsers pipelines as you need to generate your outputs.

import { parsers } from "@specifyapp/sdk";

const executePipelines = sdtfClient.createParsersPipelines(
  parsers.toTailwind({ output: { type: "file", filePath: "./tailwind.theme.js" } })
);

executePipelines is an asynchronous function that actually executes the generation. Hence:

const results = await executePipelines();

results.debug();

results is a ParsersEngineResults instance, which comes with its own set of helper methods to work with the outputs and messages issued during generation.

Note that results is plural, since a single execution can accumulate the outputs of more than one parsers pipeline.

Write the outputs to the file system

While being executed, the parsers engine produces outputs that get returned within the ParsersEngineResults instance, which comes with a few helper methods, like writeToDisk:

// ...
const results = await executePipelines();

const report = await results.writeToDisk('./public');

writeToDisk takes an optional base path and returns a promise containing a report of the written files and any errors.
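For instance, you can omit the base path and inspect the report. A minimal sketch (the exact shape of the report isn't detailed here, so we simply log it; writing relative to the current working directory when no base path is given is an assumption):

// No explicit base path passed (assumed to resolve against the current working directory).
const report = await results.writeToDisk();

// The report lists the written files, plus any errors encountered.
console.log(report);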

Run many concurrent parsers pipelines

All parsers pipelines passed to createParsersPipelines are run concurrently out of the box. This means you can write the following:

import { parsers } from "@specifyapp/sdk";

const results = await sdtfClient.createParsersPipelines(
  parsers.toTailwind({ output: { type: "file", filePath: "./tailwind.theme.js" } }),
  parsers.toJsonList({ output: { type: 'file', filePath: 'tokens-list.json' } })
)()

And have both pipelines executed concurrently from the same initial token tree.

Chain parsers to run specific pre-generation steps

In some cases, you need to chain parser functions to act like A -> B -> C, where you are only interested in C. To do so, you can leverage the chainParserFunctions util.

import { parsers, chainParserFunctions } from "@specifyapp/sdk";

const results = await sdtfClient.createParsersPipelines(
  chainParserFunctions(
    parsers.svgo({ options: { svgo: { ... } } }),
    parsers.svgToJsx({ output: { type: 'directory', directoryPath: 'icons' } })
  )
)()

In this example, we want to optimize the content of the vector tokens with SVGO, and then generate JSX components.

Execute a built-in parser function remotely

In some cases, you might have to deal with limited host machine resources (as in many CI environments). To help with this, any of Specify's built-in generation parsers can be executed remotely by passing the shouldExecuteRemotely: true option.

import { parsers } from "@specifyapp/sdk";

const results = await sdtfClient.createParsersPipelines(
  parsers.svgo({
    options: {
      shouldExecuteRemotely: true,
      svgo: { ... }
    },
    output: { type: 'directory', directoryPath: 'icons' }
  }),
)()

Doing so, the SVGO process runs on Specify's servers and the results are returned to the SDK to be further processed or written to disk.
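Since the remote results flow back into the SDK, you can, for example, mix a remote step with a local one inside a chain. Here is a sketch building on the earlier chaining example (it assumes chaining works the same way when the shouldExecuteRemotely option is set):

import { parsers, chainParserFunctions } from "@specifyapp/sdk";

const results = await sdtfClient.createParsersPipelines(
  chainParserFunctions(
    // This step runs on Specify's servers...
    parsers.svgo({ options: { shouldExecuteRemotely: true, svgo: { ... } } }),
    // ...while this one keeps running locally on the returned SVGs.
    parsers.svgToJsx({ output: { type: 'directory', directoryPath: 'icons' } })
  )
)()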

Create pipelines from parser rules configuration

A parser rule is a JSON object representing a parsers pipeline where all parsers run sequentially. Rules configurations are primarily used within the configuration file for the CLI or GitHub.

With the SDK, using parser rules configurations reduces interoperability with custom code, but can significantly increase the speed of a remote execution.

To build parsers pipelines from the SDTF client, we call the createParsersPipelinesFromRules method.

const executePipelines = await sdtfClient.createParsersPipelinesFromRules({
  name: 'Icons to JSX',
  parsers: [
    {
      name: 'svgo',
      options: { svgo: { ... } },
    },
    {
      name: 'svg-to-jsx',
      output: { type: 'directory', directoryPath: 'icons' },
    },
  ],
});

Doing so creates an async parsers engine executor in exactly the same manner as for parser functions.
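As with createParsersPipelines, you then invoke the executor and work with the resulting ParsersEngineResults instance:

const results = await executePipelines();

const report = await results.writeToDisk();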

Run faster remote executions

Any built-in generation parser exposes a shouldExecuteRemotely: boolean option to run its inner execution remotely.

Rules configurations also implement that option, allowing the SDK to collect all the remote rules and parsers and send out a single HTTP request for the whole execution.

const executePipelines = await sdtfClient.createParsersPipelinesFromRules({
  name: 'Icons to JSX',
  shouldExecuteRemotely: true,
  parsers: [ ... ],
});

Create your custom parser function

If the parsers that Specify is providing are not enough for your use case, you can create your own parser function!

Since parsers execute locally, they are simple functions, so creating a custom parser is only a matter of writing a function. But before creating your own parser, you have to understand how a parser works.

The next part describes how parsers work, but you'll quickly notice that it doesn't look like the parsers above, e.g. an output option and a parser option. The reason is that all our parsers are actually functions that return a parser function. So don't worry if it doesn't look like the above; in the end it's all the same thing.
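To illustrate, here is a simplified, hypothetical sketch (not the actual implementation; toSomething is a made-up name): a built-in parser is a factory that captures its options and returns the parser function described below.

// Hypothetical sketch: a built-in parser like `parsers.toTailwind` is a
// factory capturing options and returning the actual parser function.
function toSomething(options: { output: { type: 'file'; filePath: string } }) {
  return function parser(input: unknown, toolbox: unknown) {
    // ...generate from `input`, accumulate files via `toolbox`...
    return input;
  };
}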

The anatomy of a parser

A parser is a function that takes two parameters:

  1. An input, which will be one of the types of ParsersEngineDataBox:

    1. SDTFEngineDataBox: { type: 'SDTF Engine'; engine: SDTFEngine; }

    2. SDTFDataBox: { type: 'SDTF'; graph: SpecifyDesignTokenFormat; }

    3. JSONDataBox: { type: 'JSON'; json: Record<string, unknown>; }

    4. SVGDataBox: { type: 'SVG'; svg: Array<{ ... }> }

    5. UrlDataBox: { type: 'urls'; files: Array<{...}> }

    6. BitmapDataBox: { type: 'bitmap'; files: Array<{...}> }

    7. CustomDataBox: { type: 'custom'; custom: unknown }

  2. The ParserToolbox, which helps accumulate the output that will be written to your file system

An important thing to understand is that a parser has 2 outputs:

  1. The return type of the function, that can be passed to another parser if chained

  2. The output that you want to write to the file system (files, text, JSON, SDTF), and that'll be accumulated into the ParserToolbox

There are actually two reasons for this choice:

  1. There's only one return value, but you can append as many outputs as you want to an accumulator

  2. We need to distinguish the output of a parser from what we want to send to the next parser

Let's have a look at the output itself.

The parser output

First, let's focus on the return value. It becomes the input of the next parser if you use it inside a ParserChainer. And since the return value feeds the next parser's input, you probably guessed it: it has the same type as the input, which means one of ParsersEngineDataBox:

  • SDTFEngineDataBox: { type: 'SDTF Engine'; engine: SDTFEngine; }

  • SDTFDataBox: { type: 'SDTF'; graph: SpecifyDesignTokenFormat; }

  • JSONDataBox: { type: 'JSON'; json: Record<string, unknown>; }

  • SVGDataBox: { type: 'SVG'; svg: Array<{ ... }> }

  • BitmapDataBox: { type: 'bitmap'; files: Array<{...}> }

  • UrlDataBox: { type: 'urls'; files: Array<{...}> }

  • CustomDataBox: { type: 'custom'; custom: unknown }

Now, let's see how we can output files. To do so, we need to push one of the ParserOutput types into the outputs accumulator:

  • TextOutput: { type: 'text'; text: string }

  • SDTFOutput: { type: 'SDTF'; graph: SpecifyDesignTokenFormat }

  • JSONOutput: { type: 'JSON'; graph: string }

  • FilesOutput: { type: 'files'; files: Array<{ path: string; content: { type: 'text'; text: string; } | { type: 'url'; url: string; }}> }

Most parsers take an SDTFDataBox as input and return it unchanged as output, as they don't modify anything and only emit files. So if you're not sure what to return, just return the input.

Now that we know what a parser is, let's have a look at an example parser that creates a file listing all the token names:

import { ParserToolbox, SDTFEngineDataBox } from '@specifyapp/sdk/bulk'

function parserName(
  input: SDTFEngineDataBox,
  toolbox: ParserToolbox,
) {
  // Use the SDTF engine to collect every token name.
  const engine = input.engine;
  const names = engine
    .query
    .getAllTokenStates()
    .map(tokenState => tokenState.name);

  // Accumulate a file output that will be written to the file system.
  toolbox.populateOutput(
    {
      type: 'files',
      files: [{ path: 'names.txt', content: { type: 'text', text: names.join('\n') } }]
    }
  )

  // Return the input unchanged since we didn't modify anything.
  return input
}

Let's break down the example:

  1. We use the engine to get all the token names:

const engine = input.engine;
const names = engine
  .query
  .getAllTokenStates()
  .map(tokenState => tokenState.name);

  2. We populate the output into the accumulator:

toolbox.populateOutput(
  {
    type: 'files',
    files: [{ path: 'names.txt', content: { type: 'text', text: names.join('\n') } }]
  }
)

  3. Finally, we return the input, as we didn't modify anything and don't need to return anything else:

return input

Now that we have our custom parser, we can use it freely in the ParserPipeline or ParserChainer.
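For instance, here's a sketch chaining the custom parser with a built-in one (it assumes parserName from the example above and reuses the parsers.toJsonList output seen earlier):

import { parsers, chainParserFunctions } from "@specifyapp/sdk";

const results = await sdtfClient.createParsersPipelines(
  chainParserFunctions(
    // Our custom parser appends `names.txt` to the outputs...
    parserName,
    // ...then a built-in parser continues from the returned input.
    parsers.toJsonList({ output: { type: 'file', filePath: 'tokens-list.json' } })
  )
)()

await results.writeToDisk()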
