Background and Motivation
Our team originally built the openuilang parser in Rust and compiled it to WASM to achieve near‑native speed in the browser. Rust was fast enough for the core parsing logic, but the surrounding pipeline still required careful measurement. When we later rewrote the same logic in TypeScript, we observed a 3x improvement in end‑to‑end latency.
The parser transforms a custom DSL generated by an LLM into a React component tree, a step that runs on every streaming chunk. Because each chunk triggers the full pipeline, any latency directly impacts user experience. This context motivated us to audit the entire data flow for hidden overhead.
Pipeline Architecture
The pipeline consists of six distinct stages: autocloser, lexer, splitter, parser, resolver, and mapper. Each stage produces an intermediate representation that feeds the next, creating a deterministic flow. Understanding the data handed between stages is essential for pinpointing bottlenecks.
Autocloser patches partial input by appending minimal closing brackets or quotes, ensuring syntactic validity. The lexer performs a single‑pass scan, emitting typed tokens that the splitter groups into identifiers, expressions, and statements. The parser builds an abstract syntax tree, while the resolver resolves variable references and detects circular dependencies. Finally, the mapper converts the internal AST into the public OutputNode format consumed by React.
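The autocloser stage is the easiest one to illustrate in isolation. The following is a minimal sketch of the idea, not the actual openuilang implementation: the function name autoClose, the bracket set, and the sample DSL strings are all assumptions made for illustration. It tracks open brackets on a stack and remembers whether the scan ended inside a quoted string, then appends only the closers that are still missing. A trailing backslash inside an unfinished string is not handled here.

```typescript
// Hypothetical sketch of an autocloser; names and DSL syntax are assumed.
const BRACKET_PAIRS = new Map<string, string>([
  ["(", ")"],
  ["[", "]"],
  ["{", "}"],
]);

function autoClose(input: string): string {
  const pendingClosers: string[] = []; // closers still owed, innermost last
  let openQuote: string | null = null; // quote character we are inside, if any
  let escaped = false; // true when the previous character was a backslash

  for (const ch of input) {
    if (openQuote !== null) {
      // Inside a string: only look for the matching, unescaped quote.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === openQuote) openQuote = null;
      continue;
    }
    const closer = BRACKET_PAIRS.get(ch);
    if (ch === '"' || ch === "'") {
      openQuote = ch;
    } else if (closer !== undefined) {
      pendingClosers.push(closer);
    } else if (
      pendingClosers.length > 0 &&
      ch === pendingClosers[pendingClosers.length - 1]
    ) {
      pendingClosers.pop();
    }
  }

  // Close the dangling string first, then brackets from innermost to outermost.
  const quoteSuffix = openQuote ?? "";
  const bracketSuffix = [...pendingClosers].reverse().join("");
  return input + quoteSuffix + bracketSuffix;
}

console.log(autoClose('Card(title: "Hel')); // Card(title: "Hel")
```

Because the function only appends characters, already complete input passes through unchanged, which keeps the stage cheap to run on every streaming chunk.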
WASM Boundary Overhead
Every invocation of the WASM module incurs a fixed cost that dwarfs the raw Rust execution time. Each call begins by copying the input string from the JavaScript heap into WASM linear memory via a memcpy. After parsing, Rust serializes the result to a JSON string, which is then copied back to the JavaScript heap and deserialized by V8.
This serialization round‑trip introduces a measurable overhead that dominates the overall latency. The cost is independent of the parser's internal efficiency, meaning that even a perfectly optimized Rust core cannot escape the boundary penalty.
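To make the four boundary steps concrete, here is a simulation written entirely in TypeScript. It does not load a real WASM module: a plain Uint8Array stands in for linear memory, and fakeRustParse is a hypothetical placeholder for the Rust parser. The point is only to show how many copies and conversions a single call involves, under the assumption that the real binding layer behaves roughly as described above.

```typescript
// Stand-in for WASM linear memory: one 64 KiB page modeled as a byte array.
const linearMemory = new Uint8Array(64 * 1024);
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

// Copy bytes into "linear memory" at offset 0 and return the byte count.
function copyIn(bytes: Uint8Array): number {
  if (bytes.length > linearMemory.length) {
    throw new RangeError("payload larger than the single page in this sketch");
  }
  linearMemory.set(bytes, 0);
  return bytes.length;
}

// Hypothetical placeholder for the Rust parser: wraps the source in one node
// and serializes it to JSON, as the real module does with its result.
function fakeRustParse(source: string): string {
  return JSON.stringify({ type: "Text", value: source });
}

function simulatedRoundTrip(source: string): unknown {
  // 1. JS heap -> linear memory: encode the string and copy the bytes in.
  const inputLength = copyIn(utf8Encoder.encode(source));

  // 2. Inside the module: read the input, parse it, serialize to JSON.
  const seenByParser = utf8Decoder.decode(linearMemory.subarray(0, inputLength));
  const outputLength = copyIn(utf8Encoder.encode(fakeRustParse(seenByParser)));

  // 3. Linear memory -> JS heap: copy the JSON bytes out and decode them.
  const jsonText = utf8Decoder.decode(linearMemory.slice(0, outputLength));

  // 4. The JavaScript engine deserializes the JSON into objects.
  return JSON.parse(jsonText);
}

console.log(simulatedRoundTrip("hello"));
```

Steps 1, 3, and 4 do no parsing at all, yet they run on every call regardless of how fast step 2 is.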
Attempted Optimization: Direct JsValue Return
To eliminate the JSON step, we integrated serde-wasm-bindgen, which converts Rust structs directly into JsValue objects. We expected this to reduce copying and serialization work and thereby improve overall speed. In practice, the direct return was about 30% slower than the JSON pathway.
The slowdown stemmed from the runtime cost of constructing a JsValue for each field and the additional memory management required by the binding layer. This result reinforced the insight that the boundary cost is not solely about JSON, but about any data crossing between the two runtimes.
Outcome and Lessons Learned
Rewriting the parser in TypeScript removed the entire WASM boundary, delivering a 3x improvement in end‑to‑end latency. The experiment demonstrated that for highly interactive, streaming workloads, the overhead of crossing the language boundary can outweigh raw execution speed. It also highlighted the importance of profiling at the system level, not just within the core algorithm.
Future work may explore hybrid approaches, such as keeping computationally heavy stages in Rust while exposing only lightweight results to JavaScript. However, any such design must rigorously measure the cost of data marshaling to avoid hidden performance traps.
Future Directions
Potential avenues include profiling each stage's memory allocation pattern, experimenting with binary serialization formats, and benchmarking alternative binding tools. By quantifying the exact cost of each boundary operation we can make informed decisions about where Rust truly adds value. The ultimate goal remains a responsive, low‑latency parser that scales with the demands of real‑time LLM output.
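As a starting point for that kind of measurement, a small timing helper is often enough. The sketch below is an assumption about how one might do it, not the harness used for the numbers in this post: medianMs and the synthetic payload are hypothetical, and the median is used because single timings in a JIT-compiled runtime are noisy. It times JSON.stringify and JSON.parse separately so that each half of the serialization round‑trip gets its own number.

```typescript
// Run fn repeatedly and return the median wall-clock duration in milliseconds.
function medianMs(fn: () => void, runs: number = 200): number {
  if (runs < 1) throw new RangeError("runs must be at least 1");
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)] ?? 0;
}

// Synthetic payload standing in for a parser result (shape is assumed).
const samplePayload = {
  type: "Card",
  children: Array.from({ length: 500 }, (_, i) => ({
    type: "Text",
    value: `item ${i}`,
  })),
};
const sampleJson = JSON.stringify(samplePayload);

const stringifyMs = medianMs(() => {
  JSON.stringify(samplePayload);
});
const parseMs = medianMs(() => {
  JSON.parse(sampleJson);
});

console.log(`stringify median: ${stringifyMs.toFixed(4)} ms`);
console.log(`parse median: ${parseMs.toFixed(4)} ms`);
```

The same helper can wrap a real WASM call and a pure TypeScript call with identical input, which gives a like-for-like comparison of the two paths.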