store

package
v0.0.0-...-4ebe608 Latest
Warning

This package is not in the latest version of its module.

Published: Nov 27, 2025 License: Apache-2.0 Imports: 13 Imported by: 0

README

Default JSON store

Summary

The Default JSON Store is designed to efficiently pack values in memory, allowing shared memory for JSON documents and their clones.

When a value is added, a Handle (a 64-bit integer) is returned as a reference, which can either inline small values or point to larger ones stored in a separate slice.

The store uses various packing methods, including BCD encoding for numbers and DEFLATE compression for large strings.

It also optimizes storage for blank spaces in JSON. The store supports serialization for later reuse through gob encoding.

Goals

A JSON store aims to pack values in memory so that a JSON document and its clones may share the same memory place.

Principle

Whenever the caller Puts a value, the Store returns a Handle as a reference for future use.

Handles are not pointers but uint64 integers, so a Handle has the same size as a pointer on 64-bit systems.

The idea is that not every value results in memory being used by the Store: a small value may be packed in the Handle itself.

In that case, the Store will honor Get requests to restore the original value, but no extra memory is needed.

So the store may not really keep track of most small values: the conveyed Handle will.

The driving principle is to favor lazy evaluation of values: pack data as much as possible early on, then leave the extra CPU decoding work to callers at a later time, only for the values that actually need to be resolved.

Packing methods

The Handle is 64 bits wide. We reserve a 4-bit header to determine the kind of encoding (16 possible methods).

The remaining 60 bits are split in one of two ways:

  • either the value is inlined: 4 bits to store the length in bytes, and the remaining 56 bits to store the payload of up to 7 bytes
  • or the value is stored separately in a single inner []byte slice (the "arena"): 20 bits to store the length, 40 bits used for the offset in the inner slice

This way, the addressable space in the arena is quite large, at about 1 TB (2^40 bytes). A single value may also be rather large (up to 1 MB, i.e. 2^20 bytes).
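The arena variant of the layout can be sketched as follows. This is only an illustration: the position of each field inside the uint64 (header in the most significant bits, then length, then offset) and the names packArena and unpackArena are assumptions, not taken from the package.

```go
package main

import "fmt"

// Field widths as described above. Their order inside the handle
// (header in the top bits, then length, then offset) is an assumption.
const (
	headerBits = 4
	lengthBits = 20
	offsetBits = 40

	maxLength = 1<<lengthBits - 1 // largest length: 1 MB - 1
	maxOffset = 1<<offsetBits - 1 // largest offset: 1 TB - 1
)

// packArena builds a handle for a value of the given length stored at
// the given offset in the arena. Hypothetical helper.
func packArena(kind uint8, length, offset uint64) uint64 {
	return uint64(kind&(1<<headerBits-1))<<(lengthBits+offsetBits) |
		(length&maxLength)<<offsetBits |
		offset&maxOffset
}

// unpackArena splits a handle back into its three fields.
func unpackArena(h uint64) (kind uint8, length, offset uint64) {
	return uint8(h >> (lengthBits + offsetBits)), (h >> offsetBits) & maxLength, h & maxOffset
}

func main() {
	h := packArena(3, 1500, 1<<32)
	kind, length, offset := unpackArena(h)
	fmt.Println(kind, length, offset)
}
```

Since both fields live in the handle, retrieving a value only needs a slice expression on the arena, with no extra index structure.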

Numbers are always encoded in BCD (a modified version of the standard BCD, to accommodate a decimal separator and the exponent marker (e or E) used in scientific notation).

There are reserved Handle values for null, true and false so these never take additional memory beyond the uint64 handle.

Inlined values support payloads of up to 7 bytes. So we may inline numbers with up to 14 digits (14 BCD nibbles, i.e. 7 bytes) or strings of up to 7 bytes.
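To make the nibble arithmetic concrete, here is a minimal sketch of a BCD variant for JSON numbers. The nibble codes chosen for '.', 'e', 'E', '-' and '+' and the padding nibble are hypothetical; the codes actually used by the Store are not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// symbols maps a nibble code (its index) to a character. Codes 0-9 are
// the digits; codes 10-14 are an assumed extension for JSON numbers.
const symbols = "0123456789.eE-+"

// pad fills the low nibble of the last byte when the count is odd.
const pad = 0xF

// encodeBCD packs two characters of a JSON number per byte.
func encodeBCD(number string) ([]byte, error) {
	out := make([]byte, (len(number)+1)/2)
	for i := 0; i < len(number); i++ {
		nib := strings.IndexByte(symbols, number[i])
		if nib < 0 {
			return nil, errors.New("not a JSON number character")
		}
		if i%2 == 0 {
			out[i/2] = byte(nib)<<4 | pad // high nibble, low nibble padded for now
		} else {
			out[i/2] = out[i/2]&0xf0 | byte(nib) // overwrite the padding
		}
	}
	return out, nil
}

// decodeBCD restores the original text, skipping the padding nibble.
func decodeBCD(packed []byte) string {
	var sb strings.Builder
	for _, b := range packed {
		for _, nib := range []byte{b >> 4, b & 0x0f} {
			if nib != pad {
				sb.WriteByte(symbols[nib])
			}
		}
	}
	return sb.String()
}

func main() {
	packed, err := encodeBCD("-12.5e+3")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes -> %s\n", len(packed), decodeBCD(packed))
}
```

With this scheme a 14-digit integer occupies exactly 7 bytes, which is the inline payload size.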

Short ASCII-only strings get special handling. When a string is ASCII-only, we may pack up to 8 original bytes in the payload (each ASCII character is encoded on 7 bits, and 8 x 7 = 56 bits).
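The 7-bit packing can be sketched as below. The function names and the choice to put the first character in the least significant bits are assumptions made for the example; the length is assumed to travel separately in the handle's length field.

```go
package main

import "fmt"

// packASCII packs up to 8 ASCII bytes into a 56-bit payload, 7 bits per
// character. It reports false when the input is too long or contains a
// byte outside the ASCII range. Hypothetical helper.
func packASCII(s string) (payload uint64, ok bool) {
	if len(s) > 8 {
		return 0, false
	}
	for i := 0; i < len(s); i++ {
		if s[i] >= 0x80 {
			return 0, false
		}
		payload |= uint64(s[i]) << (7 * i)
	}
	return payload, true
}

// unpackASCII restores n characters from the payload.
func unpackASCII(payload uint64, n int) string {
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = byte(payload>>(7*i)) & 0x7f
	}
	return string(buf)
}

func main() {
	payload, ok := packASCII("deadbeef")
	fmt.Println(ok, unpackASCII(payload, 8))
}
```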

Very large numbers or strings of moderate length do require memory allocation in the inner arena.

Here we may apply DEFLATE compression for large strings (by default, more than 128 bytes).

At the moment, we don't apply DEFLATE compression for XXL numbers (dealing with more than 128 BCD nibbles should be a rare event).
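As an illustration of threshold-based compression, the sketch below uses the standard compress/flate package and the 128-byte default quoted above. The helper names maybeCompress and decompress are hypothetical, and how the Store records whether a value was compressed is not shown.

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// threshold mirrors the default quoted above: only longer strings are compressed.
const threshold = 128

// maybeCompress returns the DEFLATE-compressed bytes when s is longer
// than the threshold, and s unchanged otherwise. Hypothetical helper.
func maybeCompress(s []byte) ([]byte, bool, error) {
	if len(s) <= threshold {
		return s, false, nil
	}
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.DefaultCompression)
	if err != nil {
		return nil, false, err
	}
	if _, err := w.Write(s); err != nil {
		return nil, false, err
	}
	if err := w.Close(); err != nil { // Close flushes the remaining data
		return nil, false, err
	}
	return buf.Bytes(), true, nil
}

// decompress inflates data produced by maybeCompress.
func decompress(data []byte) ([]byte, error) {
	r := flate.NewReader(bytes.NewReader(data))
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	long := bytes.Repeat([]byte("json "), 100)
	packed, compressed, err := maybeCompress(long)
	if err != nil {
		panic(err)
	}
	fmt.Println(compressed, len(long), "->", len(packed))
}
```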

Packing blank space

Valid blank space characters in JSON are: blank (0x20), tab, carriageReturn and lineFeed.

Our default lexer interprets those as non-significant blank space that occurs before a token.

const (
	blank          = ' '
	tab            = '\t'
	lineFeed       = '\n'
	carriageReturn = '\r'
)

Any blank string is therefore an arbitrary mix of these 4 characters. There is a potential to get a higher compression ratio than a standard DEFLATE.

  • ASCII-only string => up to 8 bytes may be packed in the inline 7-byte payload
  • blanks-only (4 values -> 2 bits) => 7x4=28 blank characters may be packed in the inline payload
  • the length field requires 2 extra bits (4 -> 6 bits), so the payload is slightly reduced to 54 bits instead of 56: up to 27 blanks may be inlined.

That means that many ordinary JSON documents (i.e. with up to 27 blank characters between tokens) would not require any extra memory in the arena to store blanks.

  • blank strings with a length greater than 27 bytes are compressed using DEFLATE and stored in an "arena" dedicated to such compressed blanks.
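The 2-bit packing of blanks described above can be sketched as follows. The 2-bit code assigned to each character and the helper names are assumptions; only the 27-character limit and the 54-bit payload come from the text.

```go
package main

import (
	"fmt"
	"strings"
)

// blankChars maps a 2-bit code (its index) to a JSON blank character.
// The actual code assignment used by the Store is an assumption here.
const blankChars = " \t\n\r"

// packBlanks packs up to 27 blank characters into a 54-bit payload,
// 2 bits per character. Hypothetical helper.
func packBlanks(blanks string) (uint64, bool) {
	if len(blanks) > 27 {
		return 0, false
	}
	var payload uint64
	for i := 0; i < len(blanks); i++ {
		code := strings.IndexByte(blankChars, blanks[i])
		if code < 0 {
			return 0, false // not a JSON blank
		}
		payload |= uint64(code) << (2 * i)
	}
	return payload, true
}

// unpackBlanks restores n blank characters from the payload.
func unpackBlanks(payload uint64, n int) string {
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = blankChars[payload>>(2*i)&3]
	}
	return string(buf)
}

func main() {
	indent := "\n" + strings.Repeat(" ", 8)
	payload, ok := packBlanks(indent)
	fmt.Println(ok, unpackBlanks(payload, len(indent)) == indent)
}
```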

There are probably many special cases for optimizing the storage of blanks (e.g. "well-formed indentation: \n followed by n spaces").

Getting blank as a stores.Value

Non-significant space is, er, not significant and therefore not really a value.

Since the Store leaves a lot to the caller, we'll return a String value and leave it to the caller to know when this is a blank.

The VerbatimStore supports VerbatimValues, which keep both Value and blanks.

The Write call will know the difference and send a raw string instead of a JSON string.

Callers should be aware that verbatim tokens not holding values (such as separators or EOF) may also come with non-significant blank space. For these, the VerbatimStore may just store blanks with the PutBlanks method.

Serialization

We might want to save a given store on disk for reuse at a later time.

The Store supports gob encoding with MarshalBinary/UnmarshalBinary.
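The general pattern of backing MarshalBinary/UnmarshalBinary with gob can be sketched on a stand-in type. miniStore and its single arena field are hypothetical; what the real Store actually serializes is not described here.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// miniStore is a hypothetical stand-in for a store with an inner arena.
type miniStore struct {
	arena []byte
}

// MarshalBinary implements encoding.BinaryMarshaler using gob.
func (s miniStore) MarshalBinary() ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s.arena); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// UnmarshalBinary implements encoding.BinaryUnmarshaler using gob.
func (s *miniStore) UnmarshalBinary(data []byte) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(&s.arena)
}

func main() {
	original := miniStore{arena: []byte(`"a large string value"`)}
	data, err := original.MarshalBinary()
	if err != nil {
		panic(err)
	}
	var restored miniStore
	if err := restored.UnmarshalBinary(data); err != nil {
		panic(err)
	}
	fmt.Println(string(restored.arena))
}
```

The resulting byte slice can be written to a file and read back later to rebuild the store.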

Documentation

Overview

Package store provides default implementations for stores.Store.

It exposes a Store type to pack JSON values in memory.

An additional ConcurrentStore implementation supports concurrent access using ConcurrentStore.Get and the put methods (ConcurrentStore.PutValue, ConcurrentStore.PutToken).

The VerbatimStore implements stores.VerbatimStore, to allow users to keep non-significant blank space and reconstruct JSON documents verbatim.

Index

Constants

const ErrStore storeError = "json document store error"

Variables

This section is empty.

Functions

func RedeemStore

func RedeemStore(s *Store)

RedeemStore redeems a previously borrowed Store to the pool.

Types

type CompressionOption

type CompressionOption = func(*compressionOptions)

CompressionOption alters the default settings for string compression inside the Store.

func WithCompressionLevel

func WithCompressionLevel(level int) CompressionOption

func WithCompressionThreshold

func WithCompressionThreshold(threshold int) CompressionOption

type ConcurrentStore

type ConcurrentStore struct {
	*Store
	// contains filtered or unexported fields
}

ConcurrentStore is a stores.Store just like Store and may be used concurrently.

Concurrency

It is safe to retrieve values concurrently with store.Get, and to have several goroutines storing content concurrently.

Although it is safe to use store.WriteTo concurrently, it should not be used that way, as the result is not deterministic.
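One common way to obtain this kind of guarantee is to embed a non-concurrent store and guard it with a sync.RWMutex. The sketch below shows that pattern only; how ConcurrentStore actually synchronizes is not documented here, and every name in the example is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// miniStore is a hypothetical, non-concurrent store with an inner arena.
type miniStore struct{ arena []byte }

// put appends v to the arena and returns the offset where it starts.
func (s *miniStore) put(v []byte) int {
	offset := len(s.arena)
	s.arena = append(s.arena, v...)
	return offset
}

// concurrentStore embeds miniStore and serializes access with a mutex.
type concurrentStore struct {
	*miniStore
	mu sync.RWMutex
}

// put takes the write lock, so several goroutines may store content.
func (c *concurrentStore) put(v []byte) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.miniStore.put(v)
}

// size takes the read lock, so readers do not block each other.
func (c *concurrentStore) size() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.arena)
}

// fill stores the same 4-byte value from several goroutines at once.
func fill(c *concurrentStore, writers int) int {
	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.put([]byte("data"))
		}()
	}
	wg.Wait()
	return c.size()
}

func main() {
	c := &concurrentStore{miniStore: &miniStore{}}
	fmt.Println(fill(c, 100))
}
```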

func NewConcurrent

func NewConcurrent(opts ...Option) *ConcurrentStore

func (*ConcurrentStore) Get

Get a values.Value from a stores.Handle.

func (*ConcurrentStore) Len

func (s *ConcurrentStore) Len() int

Len returns the current size in bytes of the inner memory arena.

func (*ConcurrentStore) PutToken

func (s *ConcurrentStore) PutToken(tok token.T) stores.Handle

PutToken puts a value inside a token.T and returns its stores.Handle for later retrieval.

func (*ConcurrentStore) PutValue

func (s *ConcurrentStore) PutValue(v values.Value) stores.Handle

PutValue puts a values.Value and returns its stores.Handle for later retrieval.

func (*ConcurrentStore) WriteTo

func (s *ConcurrentStore) WriteTo(writer writers.StoreWriter, h stores.Handle)

WriteTo writes the value pointed to by the stores.Handle to a JSON writers.StoreWriter.

This avoids unnecessary allocations when transferring the value to the writer.

type Option

type Option = func(*options)

Option alters the default settings of a store (Store, ConcurrentStore or VerbatimStore).
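Option follows the functional options pattern. The sketch below reuses the documented names Option, WithArenaSize and WithEnableCompression for readability, but the fields of the options struct, the resolve helper and the default arena size are assumptions, not the package's internals.

```go
package main

import "fmt"

// options holds the settings. Field names are assumptions.
type options struct {
	arenaSize         int
	enableCompression bool
}

// Option mutates the settings, as in the type alias documented above.
type Option = func(*options)

// WithArenaSize sets the initial capacity of the arena.
func WithArenaSize(size int) Option {
	return func(o *options) { o.arenaSize = size }
}

// WithEnableCompression toggles compression of long strings.
func WithEnableCompression(enabled bool) Option {
	return func(o *options) { o.enableCompression = enabled }
}

// resolve starts from defaults and applies each option in order.
// The 4096 default is hypothetical; compression defaults to enabled.
func resolve(opts ...Option) options {
	o := options{arenaSize: 4096, enableCompression: true}
	for _, apply := range opts {
		apply(&o)
	}
	return o
}

func main() {
	o := resolve(WithArenaSize(1<<20), WithEnableCompression(false))
	fmt.Println(o.arenaSize, o.enableCompression)
}
```

Because options are applied in order, a later option overrides an earlier one that touches the same setting.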

func WithArenaSize

func WithArenaSize(size int) Option

WithArenaSize sets the initial capacity of the inner arena that stores large values.

func WithBytesFactory

func WithBytesFactory(bytesFactory func() []byte) Option

WithBytesFactory affects how Get allocates the returned buffer.

func WithCompressionOptions

func WithCompressionOptions(opts ...CompressionOption) Option

func WithEnableCompression

func WithEnableCompression(enabled bool) Option

WithEnableCompression enables compression of long strings in the Store.

Compression is enabled by default and uses the DEFLATE compression method implemented by the standard library package compress/flate.

By default, compression kicks in for strings longer than 128 bytes.

The default compression level is flate.DefaultCompression, which corresponds to a compression level of 6.

Compression may be disabled or altered using WithCompressionOptions with some CompressionOptions.

func WithPooledBytesFactory

func WithPooledBytesFactory(pooledBytesFactory func() *pools.Slice[byte]) Option

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store is the default implementation for stores.Store.

It acts as an in-memory store for JSON values, with an emphasis on compactness.

Concurrency

It is safe to retrieve values concurrently with store.Get, but it is unsafe to have several goroutines storing content concurrently.

store.WriteTo should not be used concurrently.

func BorrowStore

func BorrowStore(opts ...Option) *Store

BorrowStore borrows a new or recycled Store from the pool.
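The borrow/redeem pairing is typically built on a pool of resettable objects. The sketch below uses sync.Pool from the standard library as a stand-in; the real pool behind BorrowStore and RedeemStore is not documented here, and miniStore, borrow and redeem are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// miniStore is a hypothetical stand-in for a resettable store.
type miniStore struct{ arena []byte }

// Reset empties the arena but keeps its capacity for reuse.
func (s *miniStore) Reset() { s.arena = s.arena[:0] }

// pool hands out recycled stores, or a new one when empty.
var pool = sync.Pool{New: func() any { return &miniStore{} }}

// borrow returns a new or recycled store.
func borrow() *miniStore { return pool.Get().(*miniStore) }

// redeem resets the store and gives it back to the pool.
// The caller must not use s afterwards.
func redeem(s *miniStore) {
	s.Reset()
	pool.Put(s)
}

func main() {
	s := borrow()
	s.arena = append(s.arena, "some value"...)
	redeem(s)

	again := borrow()
	defer redeem(again)
	fmt.Println(len(again.arena))
}
```

Pairing every borrow with a deferred redeem keeps the pool effective and avoids reusing a store that was already returned.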

func New

func New(opts ...Option) *Store

New Store.

See Option to alter default settings.

func (*Store) Get

func (s *Store) Get(h stores.Handle) values.Value

Get a values.Value from a stores.Handle.

func (*Store) Len

func (s *Store) Len() int

func (Store) MarshalBinary

func (s Store) MarshalBinary() ([]byte, error)

func (*Store) PutBool

func (s *Store) PutBool(b bool) stores.Handle

PutBool is a shorthand for putting a bool value.

func (*Store) PutNull

func (s *Store) PutNull() stores.Handle

PutNull is a shorthand for putting a null value. The returned stores.Handle is always 0.

func (*Store) PutToken

func (s *Store) PutToken(tok token.T) stores.Handle

PutToken puts a value inside a token.T and returns its stores.Handle for later retrieval.

func (*Store) PutValue

func (s *Store) PutValue(v values.Value) stores.Handle

PutValue puts a values.Value and returns its stores.Handle for later retrieval.

func (*Store) Reset

func (s *Store) Reset()

Reset the Store to its initial state.

This is useful to recycle Stores from a memory pool.

Implements pools.Resettable.

func (*Store) UnmarshalBinary

func (s *Store) UnmarshalBinary(data []byte) error

func (*Store) WriteTo

func (s *Store) WriteTo(writer writers.StoreWriter, h stores.Handle)

WriteTo writes the value pointed to by the stores.Handle to a JSON writers.StoreWriter.

This avoids unnecessary buffering when transferring the value down to the writer.

type VerbatimStore

type VerbatimStore struct {
	*Store
	// contains filtered or unexported fields
}

VerbatimStore is like Store, but with the ability to store and retrieve non-significant blank space, such as indentation, space before commas, line feeds, etc.

This stores.VerbatimStore is designed to hold and reconstruct verbatim JSON documents. It is not safe to use concurrently.

JSON blanks

Valid blank space characters in JSON are: blank, tab, carriageReturn and lineFeed.

The generalized notion of blank space in Unicode does not apply (e.g. characters with the Unicode property "WSpace = Y"): such characters should result in invalid tokens when parsing your JSON.

func NewVerbatim

func NewVerbatim(opts ...Option) *VerbatimStore

func (*VerbatimStore) Get

Get a values.Value from a stores.Handle.

func (*VerbatimStore) GetVerbatim

func (*VerbatimStore) PutBlanks

func (s *VerbatimStore) PutBlanks(blanks []byte) stores.Handle

func (*VerbatimStore) PutVerbatimToken

func (s *VerbatimStore) PutVerbatimToken(tok token.VT) stores.VerbatimHandle

func (*VerbatimStore) PutVerbatimValue

func (s *VerbatimStore) PutVerbatimValue(v values.VerbatimValue) stores.VerbatimHandle

func (*VerbatimStore) Reset

func (s *VerbatimStore) Reset()

Reset the store so it can be recycled. Implements pools.Resettable.

func (*VerbatimStore) WriteTo

func (s *VerbatimStore) WriteTo(writer writers.StoreWriter, h stores.Handle)
