Stringification and Serialization of Complex JavaScript Objects

The Wilburys know what it takes to stringify Maps, Sets, ArrayBuffers, and DataViews. They have to be turned inside out. A stringification algorithm, which traverses subobjects/subproperties in a preorder traversal, naturally deals only with external properties, not internal properties. So the internal properties are turned inside out to become external properties that the algorithm can deal with.

// For example, turn a class instance of Set, say sourceSet, inside out.
const proxySource = {};
let n = 0;
sourceSet.forEach(function (value1, value2, theSet) {
  // Turn internal state properties of sourceSet
  // into external properties of proxySource.
  proxySource[n++] = value2;
});
// proxySource will replace sourceSet in the stringification algorithm.
// So we also need to copy the external properties of sourceSet over
// to external properties of proxySource.
const keys = Object.getOwnPropertyNames(sourceSet);
// Now transfer the external properties of sourceSet
// to external properties of proxySource.
for (let i = 0; i < keys.length; i++) {
  // Not a clam. An exclam! Sung to the tune of Rock Lobster.
  // If an external property name of sourceSet is a string of digits,
  // it is escaped with an exclam. That's because digit-named properties
  // of proxySource are used to represent internal properties of
  // sourceSet (see the forEach above).
  // isDigitString() is a helper that is not shown here.
  const clam = isDigitString(keys[i]) ? '!' : '';
  proxySource[clam + keys[i]] = sourceSet[keys[i]];
}
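The fragment above leaves isDigitString() undefined and assumes sourceSet already exists. The following is a minimal runnable sketch of the same idea, not the library's actual code. The function name turnSetInsideOut() and the regular expression used for isDigitString() are assumptions made for illustration.

```javascript
// Hypothetical helper: true when a property name consists only of digits.
function isDigitString(s) {
  return /^[0-9]+$/.test(s);
}

// Hypothetical wrapper around the inside-out step for a Set.
function turnSetInsideOut(sourceSet) {
  const proxySource = {};
  let n = 0;
  // Internal members become the external properties 0, 1, 2, ...
  sourceSet.forEach(function (member) {
    proxySource[n++] = member;
  });
  // External properties are copied over; digit names are escaped with '!'.
  for (const key of Object.getOwnPropertyNames(sourceSet)) {
    const exclam = isDigitString(key) ? '!' : '';
    proxySource[exclam + key] = sourceSet[key];
  }
  return proxySource;
}

const pets = new Set(['cat', 'dog']);
pets.label = 'pets'; // an ordinary external property
pets[0] = 'zero';    // an external property whose name is a digit string
const proxy = turnSetInsideOut(pets);
console.log(proxy);
```

Here the member 'cat' lands on proxy[0], while the external property named "0" is stored under "!0", so the two can no longer collide.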

If you want to attempt to understand the source code at Stringify.js, then the article Deep Copies with Circular References is a prerequisite since it introduces The Basic and Powerful Preorder Transversal which is used in Stringify.js, parseDataString.js, deepCopyData.js, and deepCopy.js. It's a general technique that should be studied by the sorts of programmers who study those sorts of things.

Introduction

The title and following paragraphs are shameless promotion. But it needs to be made clear that you should be excited because our serialization/deserialization goes way beyond JSON into actually serializing objects that are class instances in such a way that deserialization produces a deep copy of the object with equivalent but separate internal state.

The mundane level also goes way beyond JSON. JSON is rather limited in the types of objects that can be serialized. We serialize all(*) built-in data types, nested to any depth. E.g., Sets whose members are Maps whose keys are ArrayBuffers and values are typed arrays. The Set can also have properties that are DataViews whose properties are Maps whose properties are Sets whose properties are ArrayBuffers.

(*) It is impossible for any system to serialize symbols, WeakSets, and WeakMaps because there is no way for the programmer to read their internal state.

Serialization is built on stringification. In our system, stringification is just for display/dumping. Our stringification, not originally intended for serialization, was easy to modify for serialization. Our dtype() function wasn't designed for serialization either, but its detailed design automatically gave it a small but important role in checking that certain objects are suitable for serialization.

Both serialization/deserialization and stringification handle circular and duplicate references. Stringification, and hence serialization, encodes such references so that deserialization can decode them and reproduce the circular and duplicate references.

Two nodes in the object tree of x that are equal are circular references if one is the ancestor of the other. Otherwise they are duplicate references. Naive algorithms for various tasks will loop infinitely with circular references, but not with duplicate references.
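The distinction can be sketched with a small preorder walk. This is an illustration under stated assumptions, not the code in Stringify.js: the function name classifyReferences(), the path strings, and the restriction to enumerable own properties are all hypothetical choices.

```javascript
// Classify repeated object references during a preorder walk.
// `ancestors` holds the nodes on the path from the top to the current node;
// `seen` holds every object node visited so far.
function classifyReferences(top) {
  const seen = new Set();
  const ancestors = [];
  const found = [];

  function visit(node, path) {
    if (node === null || typeof node !== 'object') return; // primitives are skipped
    if (ancestors.includes(node)) {
      found.push({ path: path, kind: 'circular' });
      return; // stop here, otherwise the walk never ends
    }
    if (seen.has(node)) {
      found.push({ path: path, kind: 'duplicate' });
      return; // already walked once, no need to walk it again
    }
    seen.add(node);
    ancestors.push(node);
    for (const key of Object.keys(node)) {
      visit(node[key], path + '.' + key);
    }
    ancestors.pop();
  }

  visit(top, 'top');
  return found;
}

const x = { q: 100 }, y = { q: 101 };
x.a = y;
y.b = x;
const z = { M: x, N: y };
const references = classifyReferences(z);
console.log(references);
```

For this z, the walk reports top.M.a.b as circular (its target top.M is an ancestor) and top.N as a duplicate (its target top.M.a was seen earlier but is not an ancestor).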

Serialization/deserialization preserves property descriptors by encoding/decoding them. For example, if Y is a deep copy of X through serialization/deserialization, then the property descriptor of Y.a.b.c in Y.a.b is the same as the property descriptor of X.a.b.c in X.a.b.
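To make "the same property descriptor" concrete, here is a small sketch using only standard JavaScript. It does not show how the library encodes descriptors; the target object Yb and the helper sameDescriptor() are hypothetical and stand in for the deserialized Y.a.b.

```javascript
// X.a.b.c is a non-writable, non-enumerable property.
const X = { a: { b: {} } };
Object.defineProperty(X.a.b, 'c', {
  value: 42,
  writable: false,
  enumerable: false,
  configurable: true
});

// Read the descriptor of c as it exists in X.a.b.
const original = Object.getOwnPropertyDescriptor(X.a.b, 'c');

// Hypothetical stand-in for Y.a.b: re-apply the descriptor on a fresh object.
const Yb = {};
Object.defineProperty(Yb, 'c', original);
const copied = Object.getOwnPropertyDescriptor(Yb, 'c');

// Hypothetical comparison of two data descriptors, attribute by attribute.
function sameDescriptor(d1, d2) {
  return d1.value === d2.value &&
         d1.writable === d2.writable &&
         d1.enumerable === d2.enumerable &&
         d1.configurable === d2.configurable;
}

console.log(sameDescriptor(original, copied));
```

A plain assignment such as Yb.c = 42 would instead create a writable, enumerable property, which is why the descriptors have to be carried along explicitly.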

The following sections discuss stringification. However, after that, be sure to read the serialization appendix because it introduces terms necessary in two subsequent articles. One is the article on parseDataString(), which is our deserialization function. The other is the article on serializing classes.

Syntax of stringify()

Default values are underlined, and a question mark indicates that the item is optional.

stringify(object, param?)
- object: the required object to be stringified
- param = {doHost?:true|false, writeGetsSets?:true|false, doAbbreviate:true|false}?:
  - doHost: If true then host objects in the stream will be stringified in full rather than just noting the host object and stopping.
  - writeGetsSets: If true then getters/setters will be written rather than the value of the getter. Processing of child nodes stops with a getter/setter because a getter/setter has no child nodes.
    If false, then the value returned by the getter is written and processing continues for any of its child nodes.
  - doAbbreviate: If true then the caller property of functions is ignored. If false, it is not ignored. Showing caller properties is interesting but can drastically increase string size, and is also not suitable for deserialization.

There are actually more attributes that param can have, but they are for internal use in our library.
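The difference that writeGetsSets makes can be sketched with standard property descriptors. This is only an illustration of the choice described above, not the library's implementation: describeProperty() and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper: decide what would be written for one own property,
// depending on a writeGetsSets-style flag.
function describeProperty(owner, key, writeGetsSets) {
  const descriptor = Object.getOwnPropertyDescriptor(owner, key);
  const isAccessor = 'get' in descriptor || 'set' in descriptor;
  if (!isAccessor) {
    return { kind: 'value', value: descriptor.value };
  }
  if (writeGetsSets) {
    // Write the accessor itself; there are no child nodes to descend into.
    return { kind: 'accessor', get: descriptor.get, set: descriptor.set };
  }
  // Write whatever the getter returns; its child nodes can then be processed.
  return { kind: 'value', value: owner[key] };
}

const account = {
  get owner() { return { name: 'cat' }; }
};

const asAccessor = describeProperty(account, 'owner', true);
const asValue = describeProperty(account, 'owner', false);
console.log(asAccessor.kind, asValue.kind);
```

With the flag on, the walk stops at the getter function. With the flag off, the walk would continue into the returned object and reach its name property.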

Object and Circular References

var x = {q:100}, y = {q:101};
x.a = y;
y.b = x;
var z = {M:x, N:y};
stringify(z) =
Object {
  M:Object {
    q:100,
    a:Object {
      q:101,
      b:@(2)Object(top.M)
    }
  },
  N:#Object(top.M.a)
}

First, top refers to the top node, in this case z.

We first read that dtype(top) is "Object". This doesn't mean that top can be any old object. It means precisely that the internal prototype of top is Object.prototype, so top is one degree away from Object.prototype. Likewise top.M, top.M.a, and top.N all have "Object" as their dtype() and hence all are one degree away from Object.prototype.

The @ symbol indicates a circular reference. The right side of top.M.a.b indicates that top.M.a.b is a circular reference to top.M, an "Object" named M two levels up. That is, top.M.a.b and top.M are the same.

The # symbol indicates a duplicate reference. top.N is a duplicate reference to top.M.a. That is, top.N and top.M.a are the same.

Higher Degree Objects and Primitive Properties

var x = {a:1, b:2};       // one degree away from Object.prototype
var y = Object.create(x); // two degrees away from Object.prototype
var z = Object.create(y); // three degrees away from Object.prototype
// Define some primitive properties
z.a = null;
z.b = undefined;
z.c = true;
z.d = 7;
z.e = "cat";
z.f = 900719925474099267n; // BigInt literal
z.g = Symbol("something");
stringify(z) =
Object(3) {
  a:null,
  b:undefined,
  c:true,
  d:7,
  e:"cat",
  f:BigInt(900719925474099267),
  g:Symbol(*)
}

The dtype of z is "Object(3)", meaning that z is an object three degrees from Object.prototype. That is, z.__proto__.__proto__.__proto__ is Object.prototype.
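The notion of degree can be sketched by counting steps along the prototype chain. This is not dtype() itself, only an illustration; the function name degreeFromObjectPrototype() is hypothetical, and it only measures distance to Object.prototype.

```javascript
// Count how many prototype steps separate obj from Object.prototype.
// Returns undefined when the chain never reaches Object.prototype,
// for example for Object.create(null).
function degreeFromObjectPrototype(obj) {
  let degree = 0;
  let proto = Object.getPrototypeOf(obj);
  while (proto !== null) {
    degree++;
    if (proto === Object.prototype) return degree;
    proto = Object.getPrototypeOf(proto);
  }
  return undefined;
}

const x = { a: 1, b: 2 };
const y = Object.create(x);
const z = Object.create(y);

console.log(degreeFromObjectPrototype(x)); // 1
console.log(degreeFromObjectPrototype(y)); // 2
console.log(degreeFromObjectPrototype(z)); // 3
```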

Since it is impossible to read Symbols, we can only notate that z.g is a Symbol.

Elementary Classes

The elementary classes are Boolean, Number, String, Date, and RegExp. Stringifications of their class instances all behave the same.

Class Instances of Elementary Classes

var a = new Boolean(true);
a.x = 1;
a.y = "cat";
stringify(a) =
Boolean(true) {
  x:1,
  y:"cat"
};

We read that the dtype of a is Boolean. That means a is a Boolean class instance. That it has a value of true is indicated. Its properties are indicated inside braces. If there were no properties, the braces would be empty.

var b = new Number(5);
b.z = a;
stringify(b) =
Number(5) {
  z:Boolean(true) {
    x:1,
    y:"cat"
  }
};

The dtype of b is Number, so b is a Number class instance. It has one property z, which is the a we've seen before. To the right of z: is the stringification of a.

var c = new String("cat");
stringify(c) =
String("cat") {
  // length:3
};

The dtype of c is String, so c is a String class instance. Its value is indicated as "cat". It has no properties other than the length, which we omit since it isn't needed for informational purposes or for deserialization.

var d = new Date();
stringify(d) =
Date(1616200219563) {
};

The dtype of d is Date, so d is a Date class instance. Its value is indicated next. d has no properties.

var e = /abc/g;
e.a = "cat";
stringify(e) =
RegExp(/abc/g) {
  lastIndex:0,
  a:"cat"
};

The dtype of e is RegExp, so e is a RegExp class instance. Its value is indicated next. JavaScript gives e the lastIndex property. Our added property a is shown next.
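The value shown in parentheses for each elementary class can be read with standard methods. The sketch below is an assumption about one way to obtain those values, not a description of how stringify() does it; elementaryValue() is a hypothetical name.

```javascript
// Hypothetical helper: read the "value" part of an elementary class instance.
function elementaryValue(obj) {
  if (obj instanceof Boolean || obj instanceof Number || obj instanceof String) {
    return obj.valueOf();       // the wrapped primitive
  }
  if (obj instanceof Date) {
    return obj.getTime();       // milliseconds since the epoch
  }
  if (obj instanceof RegExp) {
    return '/' + obj.source + '/' + obj.flags;
  }
  return undefined;             // not an elementary class instance
}

const a = new Boolean(true);
a.x = 1;
const d = new Date(1616200219563);
const e = /abc/g;
e.a = 'cat';

console.log(elementaryValue(a));              // true
console.log(elementaryValue(d));              // 1616200219563
console.log(elementaryValue(e));              // /abc/g
console.log(Object.getOwnPropertyNames(e));   // [ 'lastIndex', 'a' ]
```

The last line shows where lastIndex comes from: it is an own property that JavaScript gives every RegExp object, listed ahead of the added property a.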

Arrays

The stringification of Arrays behaves in a similar manner to the stringification of literal objects.

Arrays

var x = ["cat", "dog", "hamster"];
x.a = Object.create(new Boolean(true));
stringify(x) =
Array {
  0:"cat",
  1:"dog",
  2:"hamster",
  // length:3, // neither informative, nor needed for deserialization
  a:Boolean[Object(2)] {
  }
};

The dtype of x is Array, so x is a class instance of Array. Of particular note, the stringification of x.a is shown. The dtype of x.a is "Boolean[Object(2)]", meaning that x.a is an instance of Boolean but not a class instance, since it is 2 degrees away from Boolean.prototype. It follows that type(x.a) = "Object", which is written for emphasis.
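The difference between an instance and a class instance can be observed with standard JavaScript alone. The sketch below does not use dtype() or type(); it only checks the prototype chain and shows that such an object lacks the internal Boolean state.

```javascript
// An object that inherits from a Boolean object, as in the Array example.
const inherited = Object.create(new Boolean(true));

// It is an instance of Boolean because Boolean.prototype is on its chain...
const isInstance = inherited instanceof Boolean;

// ...but not a class instance, because its direct prototype is not Boolean.prototype.
const isClassInstance = Object.getPrototypeOf(inherited) === Boolean.prototype;

// It has no internal Boolean value of its own, so valueOf() rejects it.
let threwTypeError = false;
try {
  inherited.valueOf();
} catch (err) {
  threwTypeError = err instanceof TypeError;
}

console.log(isInstance);       // true
console.log(isClassInstance);  // false
console.log(threwTypeError);   // true
```

This is one way to see why such an object is treated as an ordinary object rather than as a Boolean: there is no internal true or false value for it to report.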

Appendix: Serialization

There are four functions that wrap stringify():

serializeSimpleData()
serializeFData()
compareObjects()
serializeClassData()

The following code shows the first three in action. For serializeClassData() see the class serialization/deserialization article.

Serializing Simple Data and F-Data.
To see this in action see the parseDataString() article,
and the article on serializing classes.

// Serialize and Deserialize Simple Data
// x is simple data
const string = serializeSimpleData(x); // exception thrown if x is not simple data
const y = parseDataString(string);
// check that y is a deep copy of x
console.log(compareObjects(x,y)); // logs true

// Serialize and Deserialize F-Data
// here x is f-data
const fString = serializeFData(x); // exception thrown if x is not fdata
/* To deserialize there must be a single evaluator e
   whose context can faithfully reproduce the outer context
   of every function and getter/setter in the object tree of x.
   If there is such an evaluator e then the code continues on.
   To learn about evaluators see the first article on copying functions. */
const fCopy = parseDataString(fString, e);
// check that fCopy is a deep copy of x
console.log(compareObjects(x,fCopy)); // logs true

Simple Data

An element is simple data if every node in its object tree, including internal nodes of Sets (members) and Maps (keys and values), is one of the following.

a primitive, excluding symbols
a class instance of a built-in class, excluding WeakSets, WeakMaps, and Functions
an object one degree away from Object.prototype or one degree away from null

If the dtype of an element contains the symbol '[' then the element is not simple-data.

If neither x nor any subobject of x is a symbol, WeakSet, WeakMap, or Function, then x is simple data exactly when dtype(y) does not contain the symbol '[' for y = x and for every subobject y of x.
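The definition can be approximated without dtype() by checking each node's direct prototype. The sketch below is an assumption-laden illustration, not the check that serializeSimpleData() performs: isSimpleData() and BUILT_IN_PROTOTYPES are hypothetical names, the list of built-in classes is deliberately incomplete, and symbol-keyed properties are ignored.

```javascript
// Prototypes of the built-in classes accepted by this sketch (incomplete list).
const BUILT_IN_PROTOTYPES = new Set([
  Boolean, Number, String, Date, RegExp, Array, Map, Set, ArrayBuffer, DataView,
  Int8Array, Uint8Array, Uint8ClampedArray, Int16Array, Uint16Array,
  Int32Array, Uint32Array, Float32Array, Float64Array
].map(function (C) { return C.prototype; }));

function isSimpleData(top) {
  const visited = new Set(); // guards against circular and duplicate references

  function ok(node) {
    if (node === null) return true;
    const t = typeof node;
    if (t === 'symbol' || t === 'function') return false;
    if (t !== 'object') return true;        // remaining primitives
    if (visited.has(node)) return true;     // already checked
    visited.add(node);

    // Accept: one degree from null, one degree from Object.prototype,
    // or a class instance of a listed built-in class.
    const proto = Object.getPrototypeOf(node);
    if (proto !== null && proto !== Object.prototype &&
        !BUILT_IN_PROTOTYPES.has(proto)) return false;

    // Internal nodes of Sets and Maps.
    if (node instanceof Set) {
      for (const member of node) if (!ok(member)) return false;
    }
    if (node instanceof Map) {
      for (const [key, value] of node) if (!ok(key) || !ok(value)) return false;
    }

    // External properties; getters/setters are not simple data.
    for (const name of Object.getOwnPropertyNames(node)) {
      const d = Object.getOwnPropertyDescriptor(node, name);
      if (d.get || d.set) return false;
      if (!ok(d.value)) return false;
    }
    return true;
  }

  return ok(top);
}

const good = { list: [new Map([[new ArrayBuffer(2), new Uint8Array(2)]])] };
good.self = good; // a circular reference does not disqualify the data
console.log(isSimpleData(good));              // true
console.log(isSimpleData(Object.create({}))); // false
```

The second call fails for the same reason that a '[' would appear in the dtype: the object is more than one degree away from Object.prototype.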

FData: FData means the same thing as simple data, except that subobjects (not the element itself) may also be functions or getters/setters.
The author could not come up with a better term than FData.

Warning! The results of serializeFData() can only be deserialized by finding an appropriate evaluator, which might be difficult even when it is actually possible.

Well OK! See the parseDataString Test.js file for serialization and deserialization of fdata. Go to the end of the file and look for TestCircularFunctions() and TestCircularGets().
