Skip to content

[JSC] add support for Uint8Array.prototype.toBase64 and Uint8Array.prototype.toHex #29039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 4, 2024

Conversation

dcrousso
Copy link
Member

@dcrousso dcrousso commented May 24, 2024

2596e19

[JSC] add support for `Uint8Array.prototype.toBase64` and `Uint8Array.prototype.toHex`
https://bugs.webkit.org/show_bug.cgi?id=274635

Reviewed by Yusuke Suzuki.

These methods will make it easier for developers to encode typed character/byte data.

Note that there are other methods in the related proposal, but for ease of reviewing they will be implemented separately.

Spec: <https://tc39.es/proposal-arraybuffer-base64/spec/>
Proposal: <https://github.com/tc39/proposal-arraybuffer-base64>

* Source/JavaScriptCore/runtime/JSGenericTypedArrayViewPrototype.h:
* Source/JavaScriptCore/runtime/JSGenericTypedArrayViewPrototypeInlines.h:
(JSC::JSGenericTypedArrayViewPrototype<ViewClass>::finishCreation):
* Source/JavaScriptCore/runtime/JSGenericTypedArrayViewPrototype.cpp: Added.
(JSC::uint8ArrayPrototypeToBase64):
(JSC::uint8ArrayPrototypeToHex):

* Source/JavaScriptCore/runtime/CommonIdentifiers.h:
Create builtin identifiers for `alphabet` and `omitPadding`.

* Source/JavaScriptCore/runtime/OptionsList.h:
Add a `useUint8ArrayBase64Methods` feature flag for these new methods (and the others that are part of the proposal that haven't been implemented yet).

* Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj:
* Source/JavaScriptCore/Sources.txt:

* JSTests/stress/uint8array-toBase64.js: Added.
* JSTests/stress/uint8array-toHex.js: Added.

Canonical link: https://commits.webkit.org/280654@main

e28c233

Misc iOS, visionOS, tvOS & watchOS macOS Linux Windows
✅ 🧪 style ✅ 🛠 ios ✅ 🛠 mac ✅ 🛠 wpe ✅ 🛠 wincairo
✅ 🛠 ios-sim ✅ 🛠 mac-AS-debug 🧪 wpe-wk2 ✅ 🧪 wincairo-tests
✅ 🧪 webkitperl 🧪 ios-wk2 ✅ 🧪 api-mac ✅ 🧪 api-wpe
✅ 🧪 ios-wk2-wpt ✅ 🧪 mac-wk1 ✅ 🛠 wpe-cairo
✅ 🛠 🧪 jsc 🧪 api-ios ✅ 🧪 mac-wk2 ✅ 🛠 gtk
✅ 🛠 🧪 jsc-arm64 ✅ 🛠 vision 🧪 mac-AS-debug-wk2 🧪 gtk-wk2
✅ 🛠 vision-sim ✅ 🧪 mac-wk2-stress 🧪 api-gtk
✅ 🛠 🧪 merge ✅ 🧪 vision-wk2 ✅ 🛠 jsc-armv7
🛠 tv ✅ 🧪 jsc-armv7-tests
✅ 🛠 tv-sim
🛠 watch
✅ 🛠 watch-sim

@dcrousso dcrousso requested a review from a team as a code owner May 24, 2024 05:23
@dcrousso dcrousso self-assigned this May 24, 2024
@dcrousso dcrousso added the JavaScriptCore For bugs in JavaScriptCore, the JS engine used by WebKit, other than kxmlcore issues. label May 24, 2024
@dcrousso
Copy link
Member Author

will add some tests for this in the next few days :)

Comment on lines 39 to 47
JSValue thisValue = callFrame->thisValue();
if (!thisValue.isObject())
return throwVMTypeError(globalObject, scope, "Receiver should be a typed array view but was not an object"_s);

JSObject* thisObject = jsCast<JSObject*>(thisValue);
if (thisObject->type() != Uint8ArrayType)
return throwVMTypeError(globalObject, scope, "Uint8Array.prototype.toBase64 requires that |this| be a Uint8Array"_s);

JSUint8Array* uint8Array = jsCast<JSUint8Array*>(thisObject);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use jsDynamicCast<JSUint8Array*>(thisValue) and check whether it is non null, like

JSUint8Array* uint8Array = jsDynamicCast<JSUint8Array*>(callFrame->thisValue());
if (UNLIKELY(!uint8Array))
     return throwVMTypeError(globalObject, scope, "Uint8Array.prototype.toBase64 requires that |this| be a Uint8Array"_s);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! ive always wondered when to use jsDynamicCast vs jsCast. is there a general rule i could follow to answer that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to check if a JSValue is some type we typically use jsDynamicCast. If you know the JSValue is actually the type already we use jsCast. There are other mechanisms but those are mostly for legacy code.

Comment on lines 84 to 87
JSValue thisValue = callFrame->thisValue();
if (!thisValue.isObject())
return throwVMTypeError(globalObject, scope, "Receiver should be a typed array view but was not an object"_s);

JSObject* thisObject = jsCast<JSObject*>(thisValue);
if (thisObject->type() != Uint8ArrayType)
return throwVMTypeError(globalObject, scope, "Uint8Array.prototype.toHex requires that |this| be a Uint8Array"_s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto


JSObject* optionsObject = jsCast<JSObject*>(optionsValue);

JSValue alphabetValue = optionsObject->get(globalObject, Identifier::fromString(vm, "alphabet"_s));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define alphabet in CommonIdentifier

Comment on lines 73 to 76
const uint8_t* data = uint8Array->typedVector();
size_t offset = uint8Array->byteOffset();
size_t length = uint8Array->length();
RELEASE_AND_RETURN(scope, JSValue::encode(jsString(vm, base64EncodeToString({ data + offset, length }, mode))));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct. typedVector is already offseted vector pointer. byteOffset is indicating the offset in the original buffer. Not in the typedVector. Can you add a test for that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh! interesting. i guess i didnt look at this closely enough as i can clearly see this in JSArrayBufferView::ConstructionContext 😅

yes i will add some tests specifically for this scenario

Comment on lines 98 to 102
const uint8_t* data = uint8Array->typedVector();
size_t offset = uint8Array->byteOffset();
size_t length = uint8Array->length();
for (size_t i = offset; i < length; ++i)
builder.append(toStringWithRadix(data[i], 16));
RELEASE_AND_RETURN(scope, JSValue::encode(jsString(vm, builder.toString())));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

const uint8_t* data = uint8Array->typedVector();
size_t length = uint8Array->length();
for (size_t i = 0; i < length; ++i)
builder.append(toStringWithRadix(data[i], 16));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is allocating String per byte, which is too costly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incase it's helpful, in Bun we implemented the Node.js API Buffer.prototype.toString("hex"):

https://github.com/oven-sh/bun/blob/aa3aa888d52b8c7546ac365d56a0cd5c990cb7d8/src/string_immutable.zig#L3949-L4038

It uses SIMD and is around 10x faster than Node.js' scalar implementation

validateTypedArray(globalObject, uint8Array);

StringBuilder builder;
builder.reserveCapacity(uint8Array->length());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't one byte become 2 characters? (255 => ff)

Comment on lines 54 to 63
JSValue alphabetValue = optionsObject->get(globalObject, vm.propertyNames->alphabet);
RETURN_IF_EXCEPTION(scope, { });
if (!alphabetValue.isUndefined()) {
if (!alphabetValue.isString())
return throwVMTypeError(globalObject, scope, "Uint8Array.prototype.toBase64 requires that alphabet be \"base64\" or \"base64url\""_s);

String alphabet = alphabetValue.toWTFString(globalObject);
if (alphabet == "base64url"_s)
mode = Base64EncodeMode::URL;
else if (alphabet != "base64"_s)
return throwVMTypeError(globalObject, scope, "Uint8Array.prototype.toBase64 requires that alphabet be \"base64\" or \"base64url\""_s);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These things have side-effect since it is invoking toString, [[Get]], so they can detach the typed array.
Doesn't the spec checks the condition after that?

}

shouldBe((new Uint8Array([])).toBase64({alphabet: "base64url"}), "");
shouldBe((new Uint8Array([0])).toBase64({alphabet: "base64url"}), "AA");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just realized these are all missing padding =, which im fixing as a prerequisite in #29530 (which should also help if/when omitPadding is approved in TC39 plenary later this month)

Copy link
Member

@Constellation Constellation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me

@dcrousso dcrousso added the merge-queue Applied to send a pull request to merge-queue label Jul 4, 2024
….prototype.toHex`

https://bugs.webkit.org/show_bug.cgi?id=274635

Reviewed by Yusuke Suzuki.

These methods will make it easier for developers to encode typed character/byte data.

Note that there are other methods in the related proposal, but for ease of reviewing they will be implemented separately.

Spec: <https://tc39.es/proposal-arraybuffer-base64/spec/>
Proposal: <https://github.com/tc39/proposal-arraybuffer-base64>

* Source/JavaScriptCore/runtime/JSGenericTypedArrayViewPrototype.h:
* Source/JavaScriptCore/runtime/JSGenericTypedArrayViewPrototypeInlines.h:
(JSC::JSGenericTypedArrayViewPrototype<ViewClass>::finishCreation):
* Source/JavaScriptCore/runtime/JSGenericTypedArrayViewPrototype.cpp: Added.
(JSC::uint8ArrayPrototypeToBase64):
(JSC::uint8ArrayPrototypeToHex):

* Source/JavaScriptCore/runtime/CommonIdentifiers.h:
Create builtin identifiers for `alphabet` and `omitPadding`.

* Source/JavaScriptCore/runtime/OptionsList.h:
Add a `useUint8ArrayBase64Methods` feature flag for these new methods (and the others that are part of the proposal that haven't been implemented yet).

* Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj:
* Source/JavaScriptCore/Sources.txt:

* JSTests/stress/uint8array-toBase64.js: Added.
* JSTests/stress/uint8array-toHex.js: Added.

Canonical link: https://commits.webkit.org/280654@main
@webkit-commit-queue
Copy link
Collaborator

Committed 280654@main (2596e19): https://commits.webkit.org/280654@main

Reviewed commits have been landed. Closing PR #29039 and removing active labels.

@webkit-commit-queue webkit-commit-queue merged commit 2596e19 into WebKit:main Jul 4, 2024
@webkit-commit-queue webkit-commit-queue removed the merge-queue Applied to send a pull request to merge-queue label Jul 4, 2024
@dcrousso dcrousso deleted the eng/274635 branch July 4, 2024 03:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
JavaScriptCore For bugs in JavaScriptCore, the JS engine used by WebKit, other than kxmlcore issues.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants