V8 Quick Dive
Overview
Over the last week, I became curious about where to find the source for JavaScript built-in methods such as Array’s concat, push, and pop. I came across a few Stack Overflow answers, most of which were out of date, but they did put me on the right track. This article summarizes where I ended up.
Bottom to Top
The first approach is to start from the depths. The function names seem to follow a pattern like ArrayConcat*, where * was something outdated in the Stack Overflow answer. The code repo has a search bar, so I started there. Searching for arrayconcat did not work out, but CamelCasing it as ArrayConcat gave me a result for ArrayConcatVisitor, which is located in v8/src/builtins/builtins-array.cc.
Looking around builtins-array.cc, you can see C++ implementations of the array methods. The file appears to have been written by multiple developers, or at different times, because the naming conventions and comment patterns are not consistent. On one hand we have ArrayConcatVisitor, under a banner comment that says “Array Concat”, and on the other hand we have BUILTIN(Array*) for other methods.
Once you spot that pattern, you can search this file for BUILTIN( and see (most of?) the built-in methods. Interestingly, there is a BUILTIN(ArrayConcat). Why that didn’t show up in the search bar, I have no idea, but here is a copy of that code:
// ES6 22.1.3.1 Array.prototype.concat
BUILTIN(ArrayConcat) {
  HandleScope scope(isolate);
  Handle<Object> receiver = args.receiver();
  ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
      isolate, receiver,
      Object::ToObject(isolate, args.receiver(), "Array.prototype.concat"));
  args.set_at(0, *receiver);
  Handle<JSArray> result_array;
  // Avoid a real species read to avoid extra lookups to the array constructor
  if (V8_LIKELY(receiver->IsJSArray() &&
                Handle<JSArray>::cast(receiver)->HasArrayPrototype(isolate) &&
                Protectors::IsArraySpeciesLookupChainIntact(isolate))) {
    if (Fast_ArrayConcat(isolate, &args).ToHandle(&result_array)) {
      return *result_array;
    }
    if (isolate->has_pending_exception())
      return ReadOnlyRoots(isolate).exception();
  }
  // Reading @@species happens before anything else with a side effect, so
  // we can do it here to determine whether to take the fast path.
  Handle<Object> species;
  ASSIGN_RETURN_FAILURE_ON_EXCEPTION(
      isolate, species, Object::ArraySpeciesConstructor(isolate, receiver));
  if (*species == *isolate->array_function()) {
    if (Fast_ArrayConcat(isolate, &args).ToHandle(&result_array)) {
      return *result_array;
    }
    if (isolate->has_pending_exception())
      return ReadOnlyRoots(isolate).exception();
  }
  return Slow_ArrayConcat(&args, species, isolate);
}
This is actually the last method in a file of roughly 1500 lines. Notice the call that passes the string "Array.prototype.concat" as an argument. Also of interest: searching for Array.prototype. turns up nothing else in the file, so some other mechanism must be attaching the method implementations here to the actual array prototype. Time to come at it from the other direction.
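To make the shape of BUILTIN(ArrayConcat) easier to see, here is a minimal, self-contained sketch of the same two ideas: a macro that stamps out a function with a fixed signature (so that args and isolate are simply in scope inside the body), and a fast path that is tried first with a slow path as the fallback. Everything prefixed with Toy or TOY_ is a hypothetical name invented for this illustration. It is not V8's actual macro or types, and it assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-ins; none of these are V8's real types.
struct ToyIsolate {
  bool fast_path_allowed = true;  // plays the role of a "protector" check
  int fast_hits = 0;
  int slow_hits = 0;
};
using ToyArgs = std::vector<std::vector<int>>;  // receiver, then arguments

// In the spirit of BUILTIN(name): expands to a function header with a fixed
// signature, which is why `args` and `isolate` are available in every body.
#define TOY_BUILTIN(name) \
  std::vector<int> Toy_Builtin_##name(ToyArgs& args, ToyIsolate* isolate)

// Fast path: bails out with an empty optional when the shortcut is not safe.
std::optional<std::vector<int>> ToyFastConcat(ToyIsolate* isolate,
                                              const ToyArgs& args) {
  assert(isolate != nullptr);
  if (!isolate->fast_path_allowed) return std::nullopt;
  std::size_t total = 0;
  for (const auto& a : args) total += a.size();
  std::vector<int> out;
  out.reserve(total);  // one allocation, then bulk copies
  for (const auto& a : args) out.insert(out.end(), a.begin(), a.end());
  ++isolate->fast_hits;
  return out;
}

// Slow path: always works, copying element by element.
std::vector<int> ToySlowConcat(ToyIsolate* isolate, const ToyArgs& args) {
  std::vector<int> out;
  for (const auto& a : args)
    for (int v : a) out.push_back(v);
  ++isolate->slow_hits;
  return out;
}

TOY_BUILTIN(ArrayConcat) {
  // Try the shortcut first; fall back when it declines.
  if (auto fast = ToyFastConcat(isolate, args)) return *fast;
  return ToySlowConcat(isolate, args);
}
```

Calling Toy_Builtin_ArrayConcat with a ToyIsolate whose fast_path_allowed flag is false goes through ToySlowConcat instead, and both paths return the same elements. That matches how the real code above reads: the optimization changes the route taken, not the answer.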
Top to Bottom
In another Stack Overflow answer, there was mention of a bootstrapper. Since these answers fall out of date, the link went nowhere, but another search within the repo lands you at the bootstrapper. Searching that file for ArrayConcat gives you a function call: SimpleInstallFunction(isolate_, proto, "concat", Builtins::kArrayConcat, 1, false). Clearly this has something to do with how the implementations are attached to the prototype object. proto is declared with:
// Set up %ArrayPrototype%.
// The %ArrayPrototype% has TERMINAL_FAST_ELEMENTS_KIND in order to ensure
// that constant functions stay constant after turning prototype to setup
// mode and back.
Handle<JSArray> proto = factory->NewJSArray(0, TERMINAL_FAST_ELEMENTS_KIND,
                                            AllocationType::kOld);
Again, this gives us hints for finding other related implementations. For example, try searching for Set up or Prototype%. As I said before, this file was not written by one programmer, or at least not all at once. Several conventions are used inconsistently, such as Create the %NumberPrototype% and Setup, but you get used to coming up with the right search term; Prototype% works best for seeing where the prototypes are defined for all the primitive/built-in types.
I have yet to find out what they mean by isolate_ above, though it’s clear it is being used as a noun. The methods are referred to by names like Builtins::kNumberIsInteger. I haven’t looked into how that import chain works, but now that we have the top and the bottom, it’s a matter of tracing through the method calls and imports until we can map the full chain of how a piece of C++ code runs when you call a method on a built-in’s prototype object.
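As a rough mental model of how the top and the bottom could meet, the sketch below wires a name on a prototype to a builtin ID, and a builtin ID to a C++ function. It is a toy under stated assumptions: ToyBuiltin, ToyInstallFunction, ToyBuiltinEntry, and ToyCallMethod are hypothetical names made up here, the real V8 plumbing is more involved, and treating the 1 in the SimpleInstallFunction call as the function's declared length is my reading rather than something confirmed in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical builtin IDs, in the spirit of Builtins::kArrayConcat.
enum class ToyBuiltin { kArrayConcat, kArrayReverse };

using ToyArray = std::vector<int>;
using ToyFn = ToyArray (*)(const ToyArray& receiver, const ToyArray& arg);

// Bottom: the actual implementations.
ToyArray ToyConcatImpl(const ToyArray& receiver, const ToyArray& arg) {
  ToyArray out = receiver;
  out.insert(out.end(), arg.begin(), arg.end());
  return out;
}

ToyArray ToyReverseImpl(const ToyArray& receiver, const ToyArray& /*arg*/) {
  return ToyArray(receiver.rbegin(), receiver.rend());
}

// One table from ID to implementation; the order must match the enum.
ToyFn ToyBuiltinEntry(ToyBuiltin id) {
  static const ToyFn table[] = {ToyConcatImpl, ToyReverseImpl};
  return table[static_cast<std::size_t>(id)];
}

// A method slot on a prototype stores the ID and declared length,
// not the function pointer itself.
struct ToyMethod {
  ToyBuiltin id;
  int length;
};
using ToyPrototype = std::map<std::string, ToyMethod>;

// Top: the "bootstrapper" step that attaches a name to a builtin ID.
void ToyInstallFunction(ToyPrototype& proto, const std::string& name,
                        ToyBuiltin id, int length) {
  assert(length >= 0);
  proto[name] = ToyMethod{id, length};
}

// A method call: look the name up on the prototype, then dispatch by ID.
ToyArray ToyCallMethod(const ToyPrototype& proto, const std::string& name,
                       const ToyArray& receiver, const ToyArray& arg) {
  const ToyMethod& method = proto.at(name);  // throws if the name is missing
  return ToyBuiltinEntry(method.id)(receiver, arg);
}
```

After ToyInstallFunction(proto, "concat", ToyBuiltin::kArrayConcat, 1), a call to ToyCallMethod(proto, "concat", {1, 2}, {3}) ends up in ToyConcatImpl. The indirection through an ID is the point of the sketch: the file that builds the prototype only needs to know the ID, not where the implementation lives.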
That was easy, now the hard part
I describe the above as easy because we are just looking for the implementation of a method defined on a built-in data structure. If we really want to get sophisticated, the next exercises would be:
- find out how the prototypes are implemented in the first place
- find out how the Array itself is implemented (the internal data structure)
Both of these pursuits are worth the effort: one shows how a language feature is defined, the other how data is defined. One could imagine creating new language features as a sibling to the prototype (probably how classes were implemented), adding a new data structure (a linked list, for example), or optimizing an existing data structure (can we instantiate arrays more efficiently?). And we still haven’t completed the path above; that would probably be step 0.
Aside: V8_NOINLINE
Since I’m not a C/C++ developer, some of these declarations still seem strange to me, but finding documentation or clarification about, for example, V8_NOINLINE is as simple as a Google search leading to https://v8docs.nodesource.com/node-13.2/d7/dfc/v8config_8h.html#a24a5c8b6c341efc8dc6de3e6d0d73a50 which leads to https://v8docs.nodesource.com/node-13.2/d7/dfc/v8config_8h_source.html#l00355
I’ll save C/C++ preprocessor definitions for another time.
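In the meantime, here is a small sketch of the general pattern a macro like this tends to follow: pick a compiler-specific "do not inline" annotation if one is available, and expand to nothing otherwise. TOY_NOINLINE, ToySlowPath, and ToyCaller are hypothetical names for illustration. This is not copied from v8config.h, and the compiler checks are simplified assumptions.

```cpp
#include <cassert>

// Hypothetical macro in the style of V8_NOINLINE; not V8's definition.
#if defined(__GNUC__) || defined(__clang__)
#define TOY_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define TOY_NOINLINE __declspec(noinline)
#else
#define TOY_NOINLINE /* unknown compiler: expands to nothing */
#endif

// The annotation asks the compiler to keep this as a real, separate call.
TOY_NOINLINE int ToySlowPath(int x) { return x * 2 + 1; }

int ToyCaller(int x) {
  // Rarely taken branch kept out of line so the common path stays small.
  if (x < 0) return ToySlowPath(x);
  return x;
}
```

The macro changes how the code is compiled, not what it computes: ToyCaller returns the same values whether or not the compiler honors the annotation.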
References
- the GitHub mirror of the V8 repo
- a nice codeless but thorough writeup of this subject: https://ryanpeden.com/how-do-javascript-arrays-work-under-the-hood/
- Mozilla Firefox SpiderMonkey repo