Add adapter.[[current]] #1523

Merged
merged 6 commits into from
Apr 13, 2021

@kainino0x kainino0x commented Mar 16, 2021

Replaces #1477.

Goal: Make requestDevice() behave more similarly across scenarios, so
developers don't accidentally use it unportably: make "less extreme"
scenarios (like device.destroy()) look the same as "more extreme"
scenarios (like eGPU unplug or TDR). This prevents developers from
accidentally writing code that works in less extreme cases but fails in
more extreme cases.

Approach: Add an internal adapter.[[current]] flag. If it's false, the
adapter cannot create a device. It starts as true and gets set to false
on device loss and system state changes. It never changes back to true:
the app must call requestAdapter(), which returns new adapter objects
(not reusing existing ones).

Also note the following:

  • When a device is lost, only its adapter gets invalidated. Applications could technically go pick another adapter from a pre-built list, but I don't think they're likely to do that accidentally.
  • Unlike what we discussed in the meeting, device.destroy() does still invalidate the adapter. (It's no longer a hazard because we return new adapters every time and device loss doesn't invalidate other adapters.)
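
The flag lifecycle described above can be sketched as a toy model. This is only an illustration under assumptions: `ModelGPU`, `ModelAdapter`, and `ModelDevice` are hypothetical names, the public `current` field stands in for the internal `[[current]]` slot, and a thrown error is just one way to model "cannot create a device".

```typescript
// Toy model of the proposal; not the spec algorithm and not the real WebGPU API.
class ModelDevice {
  lost = false;
  private readonly adapter: ModelAdapter;

  constructor(adapter: ModelAdapter) {
    this.adapter = adapter;
  }

  // Per the PR description, destroy() invalidates the device's own adapter,
  // the same way a more extreme device loss would.
  destroy(): void {
    this.lost = true;
    this.adapter.current = false;
  }
}

class ModelAdapter {
  // Starts true and never changes back to true.
  current = true;

  requestDevice(): ModelDevice {
    if (!this.current) {
      // Assumption: failure is modeled as a thrown error.
      throw new Error("adapter is no longer current");
    }
    return new ModelDevice(this);
  }
}

class ModelGPU {
  // Every call returns a brand-new adapter object, never a reused one.
  requestAdapter(): ModelAdapter {
    return new ModelAdapter();
  }
}

const gpu = new ModelGPU();
const first = gpu.requestAdapter();
const other = gpu.requestAdapter(); // distinct object from `first`
const device = first.requestDevice();
device.destroy(); // invalidates `first` only

const replacement = other.requestDevice(); // `other` is unaffected
```

In this model, the only way to get a usable adapter after `first` goes stale is to hold a different adapter object or to call `requestAdapter()` again.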

@kainino0x kainino0x mentioned this pull request Mar 18, 2021

@kvark kvark left a comment

Thank you!

@kainino0x kainino0x changed the title from "Add adapter.current" to "Add adapter.[[current]]" on Mar 29, 2021
@kainino0x
Contributor Author

@RafaelCintron @litherum do you have any opinions on this?

call, the user agent *should* [=invalidate adapters=]. For example:

- A physical adapter is added/removed (via plug, driver update, TDR, etc.)
- The system's power configuration has changed (laptop unplugged, power settings changed, etc.)

@RafaelCintron RafaelCintron Mar 29, 2021

I do not think the scenarios under "system's power configuration has changed" should always trigger WebGPU context lost unless the browser specifically wants content to be moved from the high power adapter to the low power adapter.

If the WebGPU content is on a low power adapter, it should remain "current" regardless of the laptop being unplugged or power settings changing.

Perhaps change the wording from "should" to "could" here.

Contributor Author

It doesn't trigger device loss, it just puts the adapter into a state where it cannot vend new devices.
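
The distinction drawn in this reply can be illustrated with a small sketch. It is a hypothetical model, not spec text: `SketchAdapter`, `SketchDevice`, and `markStale()` are made-up names, and throwing is an assumed way to represent an adapter that refuses to vend a device.

```typescript
// Hypothetical model: marking an adapter stale is separate from losing a device.
class SketchDevice {
  lost = false;
}

class SketchAdapter {
  current = true;

  requestDevice(): SketchDevice {
    if (!this.current) {
      // Assumption: refusal is modeled as a thrown error.
      throw new Error("adapter cannot vend new devices");
    }
    return new SketchDevice();
  }

  // What a user agent might do on, e.g., a power configuration change.
  // Devices that were already vended are deliberately left untouched.
  markStale(): void {
    this.current = false;
  }
}

const staleAdapter = new SketchAdapter();
const existing = staleAdapter.requestDevice();
staleAdapter.markStale();

let vendedAfterStale = true;
try {
  staleAdapter.requestDevice();
} catch {
  vendedAfterStale = false;
}
```

Here `existing` keeps working after `markStale()`; only new `requestDevice()` calls on that adapter object are refused.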

Contributor

Earlier in the PR, it says:

> Any time the user agent needs to revoke access to a device, it calls [=lose the device=]

The PR doesn't go into detail about when "invalidation" is different than "revoking access" so I thought (incorrectly) that those were the same thing. This could use more clarification.

However, unless I am misunderstanding the proposed wording, there's still more information disclosure happening than necessary. If the system's power configuration has changed and an adapter that used to be on the old list is also on the new list, we shouldn't set its "current" flag to false. Right?

@kainino0x kainino0x Mar 31, 2021

No GPUAdapter object will ever be returned more than once. Every time you call requestAdapter() you get new GPUAdapter objects even if they refer to the same underlying hardware.

> The PR doesn't go into detail about when "invalidation" is different than "revoking access" so I thought (incorrectly) that those were the same thing. This could use more clarification.

ACK Renamed to clarify

Contributor

OK. Then by the same token, if the system's power configuration has changed and an OS adapter that used to be on the old list is also on the new list, then I think all of the WebGPU adapter instances corresponding to the OS adapter should continue to hand out devices. Do you agree?

@RafaelCintron RafaelCintron Apr 2, 2021

> What benefit would there be to allowing applications to continue vending devices from those adapters?

Because those adapters are still perfectly fine to use.

If I unplug my hybrid laptop and WebGPU is running perfectly fine on the battery saving adapter, the web developer should be none the wiser to my actions. With your PR, the web developer can detect what I've done by noticing that the WebGPU adapter object for the low power adapter no longer gives out WebGPU devices. Giving this information to WebGPU developers does not help them write better WebGPU programs so we shouldn't give it out.

Now if I unplug my hybrid laptop and WebGPU is running on the power consuming adapter, I'm more comfortable having the power consuming WebGPU adapter object stop giving out WebGPU devices. Here, there's a user benefit to giving developers this information, so it's worth doing.

Contributor Author

I want to resolve that by marking adapters stale more often, not less.

I used to have that written into the spec but it didn't make it into this revision of the PR, something like:

> mark adapters stale may be scheduled at any time, and user agents may choose to do this often (e.g. on a timer), even when there has been no system state change. This has no effect on well-formed applications and makes developers aware that calling requestAdapter() again is always necessary if a new device is desired.

Contributor Author

Added to the spec
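
A "well-formed application" in the sense of the quoted text might look roughly like the sketch below: it never caches an adapter and always starts over from `requestAdapter()` when it needs a new device. The helper names `acquireDevice` and `runWithRecovery`, the `maxDevices` bound, and the minimal `GpuLike`/`AdapterLike`/`DeviceLike` interfaces are assumptions made to keep the example self-contained; they only mirror the shape of `requestAdapter()`, `requestDevice()`, and `device.lost`.

```typescript
// Minimal structural types (assumed) so the sketch does not depend on
// browser typings.
interface DeviceLike {
  lost: Promise<unknown>;
}
interface AdapterLike {
  requestDevice(): Promise<DeviceLike>;
}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}

// Hypothetical helper: requests a fresh adapter on every call, so a stale
// adapter from an earlier call can never be reused by accident.
async function acquireDevice(gpu: GpuLike): Promise<DeviceLike> {
  const adapter = await gpu.requestAdapter();
  if (adapter === null) {
    throw new Error("no suitable adapter available");
  }
  return adapter.requestDevice();
}

// Hypothetical recovery loop: after each device loss, go back to
// requestAdapter() instead of a remembered adapter object.
async function runWithRecovery(
  gpu: GpuLike,
  onDevice: (device: DeviceLike) => void,
  maxDevices: number,
): Promise<void> {
  for (let i = 0; i < maxDevices; i++) {
    const device = await acquireDevice(gpu);
    onDevice(device);
    await device.lost; // settles once this device has been lost
  }
}
```

Because the adapter never outlives a single `acquireDevice()` call, this pattern should behave the same whether the user agent marks adapters stale rarely or on a timer.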

@RafaelCintron RafaelCintron Apr 5, 2021

I think having the browser randomly mark adapters stale for no user benefit seems overkill.

If we're going to have a "current" flag, I think it's sufficient for the spec to simply say: "Similar to context-lost, developers should be prepared to encounter adapters where the current flag has become false during the course of the WebGPU program. Examples of cases where the current flag could become false include: power state changes, and physical adapters being added/removed."

Taking a step back, when will it be the case that an adapter becomes non-current but does NOT become lost? If a developer discovers the adapter they're currently using is non-current but is happily accepting draw commands, should they take that as a hint to move their operation somewhere else? Is it an omen of possible calamity ahead?

Contributor Author

> I think having the browser randomly mark adapters stale for no user benefit seems overkill.

There is a benefit though: it makes it much harder to accidentally write applications which fail in rarer situations like system setting changes or eGPU usage. We can't rely on people to read every piece of documentation thoroughly.

> Taking a step back, when will it be the case that an adapter becomes non-current but does NOT become lost? If a developer discovers the adapter they're currently using is non-current but [a device on the adapter] is happily accepting draw commands, should they take that as a hint to move their operation somewhere else? Is it an omen of possible calamity ahead?

(Added bracketed part just to make sure I am understanding correctly.)
In my proposal it could mean nothing or it could mean that the UA might decide to lose the device later if the device lives too long and the UA wants to allow the discrete GPU to get powered down. It's intentional that no conclusive information is conveyed. If we want to give apps advance warning of a device being lost, it should be done deliberately, as an event, IMO.

: <dfn>\[[current]]</dfn>, of type boolean
::
Indicates whether the adapter is allowed to vend new devices at this time.
Its value may change at any time.
Contributor Author

I changed this from "only outside of user tasks" to "at any time" because this state is not part of the content process, so changing it between user tasks doesn't make sense.

@kainino0x
Contributor Author

Based on the feedback from the meeting, editors decided to land this. Filed followup #1630

@kainino0x kainino0x merged commit bb938b3 into gpuweb:main Apr 13, 2021
@kainino0x kainino0x deleted the no-adapter-reuse branch April 13, 2021 02:45
ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this pull request Sep 6, 2022
…exture_view_descriptor (gpuweb#902) (gpuweb#1523)

This patch adds the texture_view_descriptor in
'api,validation,capability_checks,features,texture_formats:*' in order to
check if createView throws an exception when the required feature is not enabled.