Performance
===========

pysimdjson is fast, typically tying or beating all other Python JSON
libraries when simply using :func:`simdjson.loads()` or
:func:`simdjson.load()`. However, 95% of the time spent loading a JSON
document into Python is spent in the creation of Python objects, not the
actual parsing of the document. You can avoid all of this overhead by
ignoring the parts of the document you don't want.

pysimdjson also has optimizations for loading homogeneous arrays into tools
like `numpy`_ via :func:`simdjson.Array.as_buffer()`, which is typically at
least 8x faster than other methods (see the example at the end of this page).

Don't load the entire document
------------------------------

pysimdjson supports this in two ways: JSON pointers via `at_pointer()`, or
proxies for objects and lists.

.. code:: python

    import simdjson

    parser = simdjson.Parser()
    doc = parser.parse(b'{"res": [{"name": "first"}, {"name": "second"}]}')

For the sample above, we really just want the second entry in `res`; we
don't care about anything else. We can get it in two ways:

.. code:: python

    assert doc['res'][1]['name'] == 'second'  # True
    assert doc.at_pointer('/res/1/name') == 'second'  # True

Both of these approaches will be much faster than using `load/s()`, since
they avoid loading the parts of the document we don't care about.

Both `Object` and `Array` have a `mini` property that returns their entire
content as a minified Python `str`. A message router, for example, could
parse the document, retrieve a single property (the destination), and
forward the payload without ever turning it into a Python object. Here's a
(bad) example:

.. code:: python

    import simdjson
    # Assumes an existing Flask ``app`` and a configured ``redis`` client.
    from flask import request

    @app.route('/store', methods=['POST'])
    def store():
        parser = simdjson.Parser()
        doc = parser.parse(request.data)
        redis.set(doc['key'], doc.mini)

With this, `doc` could contain thousands of objects, but the only one loaded
into a Python object was `key`, and we even minified the content as we went.

Re-use the parser
-----------------

One of the easiest performance gains, if you're working on many documents,
is to re-use the parser.

.. code:: python

    import simdjson

    parser = simdjson.Parser()
    for i in range(100):
        doc = parser.parse(b'{"a": "b"}')
        # Release the proxy before the next parse so the parser can reuse
        # its buffer.
        del doc

This will drastically reduce the number of allocations being made, as it
will reuse the existing buffer when possible. If the buffer is too small,
it will grow to fit.

.. _numpy: https://numpy.org/
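
Loading homogeneous arrays
--------------------------

Below is a minimal sketch of the :func:`simdjson.Array.as_buffer()` path
mentioned at the top of this page, assuming `numpy`_ is installed. The
`data` key is just an illustrative payload; `of='d'` selects doubles as
the element type.

.. code:: python

    import numpy as np
    import simdjson

    parser = simdjson.Parser()
    doc = parser.parse(b'{"data": [1.5, 2.5, 3.5]}')

    # Copy the array's contents into a flat buffer of doubles ('d');
    # 'i' and 'u' select 64-bit signed and unsigned integers instead.
    buf = doc['data'].as_buffer(of='d')
    values = np.frombuffer(buf, dtype=np.float64)

    print(values)  # [1.5 2.5 3.5]

Because the copy happens in C and `numpy.frombuffer` wraps the result
directly, this avoids the per-element Python object creation that dominates
the cost of `loads()`.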