Sugawara Yuuta
Feb 12, 2023

How I created the fastest JSON decoder in Go

Introduction

Hello, my name is Sugawara Yuuta. In this post, I would like to introduce a JSON decoder that I came up with during a break in high school and built in my spare time.

As far as I know, it is the fastest decoder that accepts arbitrary types through a generic interface (that is, in the same style as the standard package).

Motivation

When I started developing with the Go language, I was surprised to see how many third-party JSON decoders have been created (despite the Go language community’s style of doing everything with the standard library compared to JavaScript, etc.).

However, after trying out both large and small ones, I realized that each has plenty of problems, though not necessarily the same ones. I cover them in more detail in the section below.

Problems with the JSON decoders so far

I will not name specific libraries.

  • CPU-dependent (e.g., assembly is used for a chance at more speed, but as a result nothing other than amd64 is supported. For example, M1 and Raspberry Pi are arm64)
  • Not user-friendly (incompatible with standard packages, complicated or unintuitive to use, etc.)
  • Preparation is required (users need to write switch-cases or use code generation, which adds an extra step and complicates the development workflow).
  • Not well maintained (new language versions are not supported, relies on EOL packages, etc.)
  • Not fast after all! (Heavy use of reflection, weak support for io.Reader, etc.)

Aim of future JSON decoders

  • CPU dependency → implement everything in pure Go. Even if it is slower than something using JIT or SIMD in assembly, I personally think it is worth it.
  • Not user-friendly / Preparation is required → The standard Go package is designed to be intuitive and straightforward to use, so being compatible with it is enough.
  • Not fast after all! → While adhering to the two points above, always develop with speed in mind.

To achieve this, we created sugawarayuuta/sonnet, which we introduce here.

sugawarayuuta/sonnet

https://github.com/sugawarayuuta/sonnet

Features

  • Nearly compatible with the standard package's decoder (full compatibility is planned)
  • Fast. Roughly 5 times faster than the standard package, faster than json-iterator/go and goccy/go-json, and in some cases faster than bytedance/sonic (which claims to be the fastest).

Benchmarks

https://github.com/RichardHightower/json-parsers-benchmark/blob/master/data/citm_catalog.json

BenchmarkEncodingJson-4        56    19289518 ns/op    5850983 B/op    33049 allocs/op
BenchmarkSonnet-4             313     3477650 ns/op     905760 B/op     5469 allocs/op
BenchmarkGoccyGoJson-4        247     4855301 ns/op    2989978 B/op    14323 allocs/op
BenchmarkJsonIteratorGo-4     225     5163313 ns/op    1339385 B/op    34599 allocs/op
BenchmarkSegmentioEncoding-4  130     9156341 ns/op    5129332 B/op     3303 allocs/op
BenchmarkBytedanceSonic-4     274     4235117 ns/op    3848331 B/op    10505 allocs/op

The results above were measured with the benchmarking tool that ships with Go (go test -bench), using the citm_catalog.json file linked above as input.

If speed is not a concern for your project, this is not a problem, but the standard package encoding/json is quite slow.

If you are interested, I recommend measuring it yourself.

Decoder implementation

  • Create a map and store type-specific processing.

In Go, any comparable type can be used as a map key.

Go pointers are comparable.

You can get the dynamic type of a value by reading the interface value through unsafe.

The type you get is a pointer, called *rtype inside the `reflect` package, and if the type is the same, the address is the same.

Given the above, we can create a map whose keys are *rtype and whose values are functions. Reflection is then only needed the first time a type is seen; from the second time on, the stored function is reused.
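The idea can be sketched as follows. This is a simplified illustration, not the code from sonnet: the `eface` struct assumes the current runtime layout of an empty interface (an implementation detail, not a documented guarantee), and `typeOf`, `decodeFunc`, `decoderFor` and `builds` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// eface mirrors how the runtime currently lays out an empty interface:
// a pointer to the type descriptor, then a pointer to the data.
// This is an assumption about an implementation detail.
type eface struct {
	typ  unsafe.Pointer
	data unsafe.Pointer
}

// typeOf returns the address of v's type descriptor without calling reflect.
func typeOf(v any) unsafe.Pointer {
	return (*eface)(unsafe.Pointer(&v)).typ
}

// decodeFunc is a hypothetical per-type decoder.
type decodeFunc func(src []byte, dst any) error

var (
	mu     sync.RWMutex
	cache  = map[unsafe.Pointer]decodeFunc{}
	builds int // how many times a decoder had to be constructed
)

// decoderFor returns the cached decoder for v's type, building it on first use.
func decoderFor(v any) decodeFunc {
	key := typeOf(v)
	mu.RLock()
	fn, ok := cache[key]
	mu.RUnlock()
	if ok {
		return fn
	}
	// The expensive reflection-based construction would happen here, once per type.
	fn = func(src []byte, dst any) error { return nil }
	mu.Lock()
	cache[key] = fn
	builds++
	mu.Unlock()
	return fn
}

func main() {
	var a, b *int
	var c *string
	decoderFor(a)
	decoderFor(b)
	decoderFor(c)
	fmt.Println(typeOf(a) == typeOf(b), typeOf(a) == typeOf(c), builds) // true false 2
}
```

A safer variant with the same shape would use reflect.Type as the map key, at the cost of calling reflect.TypeOf on every lookup.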

  • Where a map can be replaced, use a slice or a fixed-length array instead.

For example, if you want to use byte as a key, you can take advantage of the fact that a byte has only 256 possible values (0 to 255) and substitute a fixed-length array such as [256]bool.
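As a small illustration, a [256]bool table can classify bytes with a single array index instead of a map lookup. The `whitespace` table and `skipSpace` helper below are hypothetical names chosen for this sketch.

```go
package main

import "fmt"

// whitespace marks the four JSON whitespace bytes. Indexing with a byte can
// never go out of range because the array has exactly 256 entries.
var whitespace = [256]bool{' ': true, '\t': true, '\n': true, '\r': true}

// skipSpace returns the index of the first non-whitespace byte at or after i.
func skipSpace(src []byte, i int) int {
	for i < len(src) && whitespace[src[i]] {
		i++
	}
	return i
}

func main() {
	src := []byte(" \n\t{\"a\":1}")
	fmt.Println(skipSpace(src, 0)) // 3
}
```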

  • In some cases, create your own map.

In my case, I used Robin Hood hashing together with the `hash/maphash` standard package to build a map whose entries are set with a string key and looked up with a []byte key.

Surprisingly, it was faster in my environment than when I used xxhash, etc.

  • Pointer arithmetic

In Go 1.17, `unsafe.Add` was added to make it easier to do pointer operations.

I use it a lot in my package. For example, when I want to access a field in a struct, I can cast the type to structType to get the field's offset and add it to the struct's pointer.

The same is true for arrays: unsafe.Add(pointer, array length*element type size) gives a new pointer at which a new item can be added to the array.

  • go:linkname

If a feature you need is not exported from the reflect or runtime packages, you can use it by declaring a local alias with go:linkname. Although my package does not currently do this, you could check in an init() function whether the alias is working and, if not, fall back to the standard library, which removes the fear of depending on unexported features.

  • mapassign_faststr

When I created an alias for mapassign_faststr from reflect in the above manner, I looked at the lower-layer implementation (I noticed it by following the stack trace when it panicked) and found that, after allocating the slot, it makes a copy of the value and inserts that copy. Instead, I created an alias from runtime, allocated the slot, and then assigned directly through the returned pointer, which greatly reduced allocations.
