Native API

The native simdjson API can be significantly faster than the builtin-compatible API when only part of a document is of interest.

Objects and arrays are returned as fake dicts (Object) and lists (Array) that delay the creation of Python objects until they are accessed.

class simdjson.Parser

A Parser instance is used to load and/or parse a JSON document.

A Parser can be reused to parse multiple documents, in which case it will reuse its internal buffer, growing it only if needed.

Parameters

max_capacity – The maximum size the internal buffer can grow to. [default: SIMDJSON_MAXSIZE_BYTES]

implementation

The active parser implementation as (name, description). Can be any value from implementations. The best implementation for your current platform will be picked by default.

Can be set to the name of any valid implementation to globally change the Parser implementation.

Warning

Setting this to an implementation inappropriate for your platform WILL cause illegal instructions or segfaults at best. It’s up to you to ensure an implementation is valid for your CPU if you choose to override the automatic choice.

implementations

A list of available parser implementations in the form of [(name, description),…].

load(self, path, bool recursive=False)

Load a JSON document from the file at path.

If any Object or Array proxies still pointing to a previously-parsed document exist when this method is called, a RuntimeError may be raised.

Parameters
  • path – A filesystem path.

  • recursive – Recursively turn the document into real Python objects instead of pysimdjson proxies. [default: False]

parse(self, src, bool recursive=False)

Parse the given JSON document.

The source document may be a str, bytes, bytearray, or any other object that implements the buffer protocol.

Performance

While you can pass quite a few things to this method to be parsed, simple bytes will almost always be the fastest.

If any Object or Array proxies still pointing to a previously-parsed document exist when this method is called, a RuntimeError may be raised.

Parameters
  • src – The document to parse.

  • recursive – Recursively turn the document into real Python objects instead of pysimdjson proxies. [default: False]

class simdjson.Array

A proxy object that behaves much like a real list().

Python objects are not created until an element in the list is accessed. When you only need a subset of an Array, this can be much faster than converting an entire array (and all of its children) into real Python objects.

as_buffer(self, *, of_type)

Copies the contents of a homogeneous array to an object that can be used as a buffer. This means it can be used as input for numpy.frombuffer, bytearray, memoryview, etc.

When n-dimensional arrays are encountered, this method will recursively flatten them.

Note

The object returned by this method contains a copy of the Array’s data. Thus, it’s safe to use even after the Array or Parser are destroyed or reused.

Parameters

of_type – One of 'd' (double), 'i' (signed 64-bit integer) or 'u' (unsigned 64-bit integer).

as_list(self)

Convert this Array to a regular Python list, recursively converting any objects or lists it finds.

at_pointer(self, json_pointer)

Get the value at the given JSON pointer.

mini

Returns the minified JSON representation of this Array.

Return type

bytes

class simdjson.Object

A proxy object that behaves much like a real dict().

Python objects are not created until an element in the Object is accessed. When you only need a subset of an Object, this can be much faster than converting an entire Object (and all of its children) into real Python objects.

as_dict(self)

Convert this Object to a regular Python dictionary, recursively converting any objects or lists it finds.

at_pointer(self, json_pointer)

Get the value at the given JSON pointer.

get(self, key, default=None)

Return the value of key, or default if the key does not exist.

items(self)

Returns an iterator over all the (key, value) pairs in this Object.

keys(self)

Returns an iterator over all keys in this Object.

mini

Returns the minified JSON representation of this Object.

Return type

bytes

values(self)

Returns an iterator over all values in this Object.

Constants

simdjson.MAXSIZE_BYTES: int = 4294967295

The maximum document size (in bytes) supported by simdjson.

simdjson.PADDING: int = 32

The amount of padding needed in a buffer to parse JSON.

In general, pysimdjson takes care of padding for you and you do not need to worry about this.

simdjson.VERSION: str

The version of the embedded simdjson library.