Native API#

The native simdjson API offers significant performance improvements over the builtin-compatible API if only part of a document is of interest.

Objects and arrays are returned as fake dicts (Object) and lists (Array) that delay the creation of Python objects until they are accessed.

class simdjson.Parser#

A Parser instance is used to load and/or parse a JSON document.

A Parser can be reused to parse multiple documents, in which case it will reuse its internal buffer, only increasing it if needed.

Parameters:

max_capacity – The maximum size the internal buffer can grow to. [default: SIMDJSON_MAXSIZE_BYTES]

get_implementations(self, supported_by_runtime=True)#

A list of available parser implementations in the form of [(name, description),…].

By default, this only returns the implementations that are usable on the current runtime. Setting supported_by_runtime to False will instead return all the implementations compiled into this build of simdjson.

implementation#

The active parser implementation as (name, description). Can be any value returned by get_implementations(). The best implementation for your current platform will be picked by default.

Can be set to the name of any valid implementation to globally change the underlying Parser implementation, such as to disable AVX-512 if it is causing down-clocking.

load(self, path, bool recursive=False)#

Load a JSON document from the file system path given by path.

If any Object or Array proxies still pointing to a previously-parsed document exist when this method is called, a RuntimeError may be raised.

Parameters:
  • path – A filesystem path.

  • recursive – Recursively turn the document into real Python objects instead of pysimdjson proxies. [default: False]

parse(self, src, bool recursive=False)#

Parse the given JSON document.

The source document may be a str, bytes, bytearray, or any other object that implements the buffer protocol.

Performance

While you can pass quite a few things to this method to be parsed, simple bytes will almost always be the fastest.

If any Object or Array proxies still pointing to a previously-parsed document exist when this method is called, a RuntimeError may be raised.

Parameters:
  • src – The document to parse.

  • recursive – Recursively turn the document into real Python objects instead of pysimdjson proxies. [default: False]

class simdjson.Array#

A proxy object that behaves much like a real list().

Python objects are not created until an element in the list is accessed. When you only need a subset of an Array, this can be much faster than converting an entire array (and all of its children) into real Python objects.

as_buffer(self, *, of_type)#

Copies the contents of a homogeneous array to an object that can be used as a buffer. This means it can be used as input for numpy.frombuffer, bytearray, memoryview, etc.

When n-dimensional arrays are encountered, this method will recursively flatten them.

Note

The object returned by this method contains a copy of the Array’s data. Thus, it’s safe to use even after the Array or Parser are destroyed or reused.

Parameters:

of_type – One of ‘d’ (double), ‘i’ (signed 64-bit integer) or ‘u’ (unsigned 64-bit integer).

as_list(self)#

Convert this Array to a regular Python list, recursively converting any objects/lists it finds.

at_pointer(self, json_pointer)#

Get the value at the given JSON pointer.

mini#

Returns the minified JSON representation of this Array.

Return type:

bytes

class simdjson.Object#

A proxy object that behaves much like a real dict().

Python objects are not created until an element in the Object is accessed. When you only need a subset of an Object, this can be much faster than converting an entire Object (and all of its children) into real Python objects.

as_dict(self)#

Convert this Object to a regular Python dictionary, recursively converting any objects or lists it finds.

at_pointer(self, json_pointer)#

Get the value at the given JSON pointer.

get(self, key, default=None)#

Return the value of key, or default if the key does not exist.

items(self)#

Returns an iterator over all the (key, value) pairs in this Object.

keys()#

Returns an iterator over all keys in this Object.

mini#

Returns the minified JSON representation of this Object.

Return type:

bytes

values(self)#

Returns an iterator over all values in this Object.

Constants#

simdjson.MAXSIZE_BYTES: int = 4294967295#

The maximum document size (in bytes) supported by simdjson.

simdjson.PADDING: int = 32#

The amount of padding needed in a buffer to parse JSON.

In general, pysimdjson takes care of padding for you and you do not need to worry about this.

simdjson.VERSION: str#

The version of the embedded simdjson library.