-
Notifications
You must be signed in to change notification settings - Fork 965
Live API v1 #9205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Live API v1 #9205
Conversation
|
Size Report 1Affected Products
Test Logs |
Size Analysis Report 1Affected Products
Test Logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow. Would it be messy to store this as some native audio file and then load and convert it to base64 in this file? Would that require a third party library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No we wouldn't need a third-party library. In Node we could read the file from disk and encode it, but in the Browser tests I think it'd be tricky to get a hold of the file on disk. Maybe there's a Karma config that will load the file into browser memory.
In my opinion storing it as an already base64 encoded string is the simplest option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to leave it as is (as base64) but for what it's worth, this is how convert-mocks.ts works, it gets called with a yarn testsetup
step before tests run, and it runs in node and uses fs
to open and convert the golden json files to a js file that even the browser tests can import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I could add a integration test set up script to do that...
@@ -128,7 +134,12 @@ export class BrowserWebSocketHandler implements WebSocketHandler { | |||
} | |||
}; | |||
|
|||
const closeListener = (): void => { | |||
const closeListener = (event: CloseEvent): void => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two websocket implementations have a lot of shared code, is it worth having a base class, or a bunch of utility functions they both use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, right now there's only one line that's different (this.ws.binaryType = 'blob'
in Node).
If we were to do this, I think we could share everything except for a initializeWebSocket()
. The reason they're not shared right now is because I am concerned that small runtime differences between Node and Browser WebSocket APIs may appear, and justify adding more platform-specific code. I haven't found any instances of this so far, so maybe this concern isn't justified.
Could we postpone refactoring this until the Live API works with the current implementation? I want to be certain that there are no other runtime differences that I haven't encountered yet in my testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can avoid using a base class for that reason, but it might be good to have importable standalone functions for shared code that would shorten some of the methods and reduce duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I do that in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try it and see if it looks cleaner? I'll leave it up to your judgement. If you do a quick attempt and it feels clunky, worry about optimization later.
|
||
Returns a [LiveGenerativeModel](./ai.livegenerativemodel.md#livegenerativemodel_class) class for real-time, bidirectional communication. | ||
|
||
The Live API is only supported in modern browser windows and Node ><!-- -->= 22. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to emphasize this limitation in the narrative guides?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to emphasize this, since it's not very restrictive. The vast majority of browser users are using 'modern browsers'. Node users would be a very small subset of all Live users, and Node 22 is only one version above our minimum supported version.
@hsubox76 What do you think?
Supported environments: https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API#api.websocket
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would leave the Node restriction in. "If using Node, the Live API is only supported in version 22 and greater." or whatever the right phrasing is for >= Otherwise we should change our engines field, and that's effectively changing our firebase minimum to 22 because users who install firebase automatically get @firebase/ai
and all its npm warnings.
* refactor websocket * remove test ws server
Adds support for the Public Preview Live API, with foundational methods to reach parity with Flutter, Android, and Unity.
API
LiveGenerativeModel
class establishes a WebSocket connection and configures aLiveSession
LiveSession
includes foundational methods for live content generation:receive()
,send()
,sendMediaChunks()
,sendMediaStream()
id
property toFunctionCall
andFunctionResponse
, which is used by the Google AI backend to map function responses to calls. This property won't be in Vertex AI responses.Testing
WebSocketHandler
to simulate server messages.