
EPISODE · Jan 26, 2024 · 1H 2M

722: Next Level Web APIs. Bluetooth, File Access, Thomas Steiner - Project Fugu

from Syntax - Tasty Web Development Treats · host Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers

Thomas Steiner talks with us about Project Fugu, an effort from Google to enable new classes of applications to run on the web. What is Project Fugu? What are some of Thomas’ favorite APIs to use? What is an IWA vs a PWA? And more!

Show Notes
00:32 Welcome
01:52 Who is Thomas Steiner?
02:57 What is the overall goal of Project Fugu?
08:17 When might we see these APIs come to all browsers?
14:10 Do you have examples of companies pushing for an API?
18:53 What happens with the face detection API?
28:33 What is an IWA?
35:17 What is the web transport API for?
37:11 What is MIDI?
41:20 Nintendo Joycon hack
45:28 File handlers in a PWA
50:38 File System Observer API coming soon
56:26 Sick Picks

Onnx, HuggingFace, Chrome for Developers, Fugu API Tracker, Google I/O 2023, LEGO Education SPIKE, Igalia, CapCut, Descript, Better Touch Tool

Sick Picks: Laser printers
Shameless Plugs: HowFuguIsMyBrowser

Hit us up on Socials!
Syntax: X, Instagram, Tiktok, LinkedIn, Threads
Wes: X, Instagram, Tiktok, LinkedIn, Threads
Scott: X, Instagram, Tiktok, LinkedIn, Threads


TRANSCRIPT · AUTO-GENERATED

I hear you're hungry. Cool, I'm starving. Wash those hands, pull up a chair and secure that feed bag. Because it's time to listen to Scott Tolinski and Wes Bos attempt to use human language to converse with and pick the brains of other developers.

I thought there was going to be food. So buckle up and grab that old handle because this ride is going to get wild. This is the Syntax Supper Club. Welcome to Syntax.

Today we've got a good one for you. We have Thomas Steiner on, who is a developer relations engineer at Google, and he's on the Google Chrome team. And he's here to talk today about Project Fugu, which is this sort of effort to... I'll let him explain it. But essentially, there's a lot of really cool stuff in the browser from Bluetooth.

And I'll tell you the story of how this podcast came about: I have this little label printer that I bought on Amazon. It's called the Niimbot D110. And it is really cool because you can use your phone to type and print labels. And I thought, that's cool.

I had it for about a day. And then I go, of course, can I hack it? How does this thing work? I always try to dip into that.

So I spent an evening dipping into the Web Bluetooth API. And I was amazed that, A, the Web Bluetooth API is the best API for working with Bluetooth. Full stop. I was looking at the Python API for working with Bluetooth.

That sucks. It's so simple using JavaScript. And B, I was amazed that I was able to connect to it and see all of its capabilities. I didn't get it to print a label just yet, because some Bluetooth devices don't want you to.
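For readers who want to try the "connect and see all of its capabilities" step, here is a minimal sketch of what that can look like with Web Bluetooth. The interface names ending in `Like`, the helper `describeCharacteristic`, and the function `listCapabilities` are illustrative assumptions declared locally so the snippet type-checks on its own; the calls they mirror (`requestDevice`, `gatt.connect`, `getPrimaryServices`, `getCharacteristics`) are the ones the Web Bluetooth API exposes. It is not the code used on the show.

```typescript
// Minimal structural types for the parts of Web Bluetooth used below.
// They mirror the real interfaces but are declared locally (an assumption)
// so the sketch compiles without extra type packages.
interface GattCharacteristicLike {
  uuid: string;
  properties: { read: boolean; write: boolean; notify: boolean };
}
interface GattServiceLike {
  uuid: string;
  getCharacteristics(): Promise<GattCharacteristicLike[]>;
}
interface GattServerLike {
  getPrimaryServices(): Promise<GattServiceLike[]>;
}
interface BluetoothDeviceLike {
  name?: string;
  gatt?: { connect(): Promise<GattServerLike> };
}
interface BluetoothLike {
  requestDevice(options: {
    acceptAllDevices: boolean;
    optionalServices: string[];
  }): Promise<BluetoothDeviceLike>;
}

// Turn a characteristic's property flags into a label like "read+notify".
function describeCharacteristic(c: GattCharacteristicLike): string {
  const flags = (["read", "write", "notify"] as const).filter((f) => c.properties[f]);
  return `${c.uuid} [${flags.join("+") || "none"}]`;
}

// Ask the user to pick a device, connect, and list what it exposes.
// `serviceUuids` names the services to inspect; the browser only grants
// access to services that were listed when the device was requested.
async function listCapabilities(
  bluetooth: BluetoothLike,
  serviceUuids: string[],
): Promise<string[]> {
  const device = await bluetooth.requestDevice({
    acceptAllDevices: true,
    optionalServices: serviceUuids,
  });
  if (!device.gatt) throw new Error("Device has no GATT server");
  const server = await device.gatt.connect();
  const lines: string[] = [];
  for (const service of await server.getPrimaryServices()) {
    for (const characteristic of await service.getCharacteristics()) {
      lines.push(`${service.uuid} -> ${describeCharacteristic(characteristic)}`);
    }
  }
  return lines;
}

// In a supporting browser, call this from a click handler (a user gesture
// is required), for example:
//   listCapabilities((navigator as any).bluetooth, ["battery_service"]);
```

Passing the `bluetooth` object in as a parameter is a design choice for the sketch: it keeps the function usable with a stand-in object outside the browser.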

They want you to use their app. And some of them want to be open. But Thomas sent me a message. He's like, love seeing this.

Because that's what you do. So welcome, Thomas. Thanks for coming on. Tell us who you are and what you do.

Sure. Thanks for having me. So yeah, as I said, I'm part of the Chrome team at Google. I'm focused on what we call Project Fugu, and also responsible for WebAssembly.

And yeah, so I was listening to this episode with Ferris to do a couple days ago, weeks ago, maybe. I heard Project Fugu and I was like, wait, that's something that I'm working on. And I'm very interested in. And yeah, then I sort of just reached out on Twitter, sent you a DM.

And yeah, luckily you were welcoming enough of my self-invitation to be part of the show. And yeah, chat about Project Fugu. So yeah, thanks for having me. And glad to be here.

Yeah, you're welcome. Fugu is one of those things that comes up from time to time on the show, along with things like Houdini, where we're saying like, all right, these are these big, big overarching plans. I guess from an outsider's perspective, they always seem very ambitious. What is the goal, overall goal of Project Fugu?

So essentially what our goal is, if you have an app idea, you should be able to realize this app idea on the web. And typically when it comes to app ideas, it's like, yeah, straightforward, hack, hack, hack, oh, how do I do that? And then in the worst case, you hit this, oh, there's an API gap, as we call it. There's just no way to do whatever.

Get an X signal onto Y. So it's like, oh no, I can't realize this app idea on the web. I need to go native or I need to go Electron or whatever, you name it. So Project Fugu is essentially this vision that you should be able to do and realize all your app ideas on the web.

And if that means inventing new web APIs, then sure, we accept the challenge. We try to do it, but try to standardize it. And yeah, in a nutshell, this is what Fugu is about. That's awesome.

I'm going to recommend that you go to fugu-tracker.web.app. I'll link it up. And there's a couple of websites that sort of detail it. But just scrolling through the list of web APIs that are either shipped in some browsers or are sort of being thought about is so cool.

Because at the very basics, I posted a screenshot of me working on the Bluetooth thing. And everyone was like, you can access Bluetooth in the browser. And it's like, well, yeah, in Chrome, they have it. And they've had it for quite a while.

And there's also a Web Serial API, which, on the electronics show, I talked about how I bought these little ESP32s. And the process for flashing them is plug it in via USB and visit a website. And it will flash your hardware from it. Same with my Bose headphones.

I can update the firmware on my Bose headphones via a web browser. And that experience, let alone people wanting to build desktop apps with web tech, that's probably a big part of this as well. But just that experience of people not having to download some sketchy-ass Windows-only exe and run it to be able to update your software is such a nice experience.
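Because several of the APIs mentioned so far ship only in some browsers, a page typically checks for them before offering a feature. The following is a minimal sketch of that kind of check; the function name `detectCapabilities` and the keys of the result object are illustrative assumptions, while the property names being probed (`bluetooth`, `serial`, `usb`, `hid`, `requestMIDIAccess`, `storage.getDirectory`, `BarcodeDetector`, `WebTransport`) are the entry points those APIs expose.

```typescript
// Report which capability APIs a given environment exposes.
// `nav` and `globalScope` are passed in (rather than read from globals) so
// the function can be tried anywhere; in a browser, pass `navigator` and
// `globalThis`.
type ApiSupport = Record<string, boolean>;

function detectCapabilities(nav: object, globalScope: object): ApiSupport {
  const storage = (nav as { storage?: object }).storage;
  return {
    webBluetooth: "bluetooth" in nav,
    webSerial: "serial" in nav,
    webUsb: "usb" in nav,
    webHid: "hid" in nav,
    webMidi: "requestMIDIAccess" in nav,
    originPrivateFileSystem: !!storage && "getDirectory" in storage,
    barcodeDetection: "BarcodeDetector" in globalScope,
    webTransport: "WebTransport" in globalScope,
  };
}

// Browser usage:
//   console.table(detectCapabilities(navigator, globalThis));
```

A presence check like this only says the entry point exists; permission prompts, secure-context requirements, and platform support can still cause a later call to fail.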

With Xtoto, yeah. So there's a couple of new APIs that enable this, experiences like this and others. We have the Pixel Buds, recently, that now have this app that allows you to configure the equalizer and stuff. Because, yeah, as I said, people feel bad if they have to give up whatever, their location history, to change whatever equalizer setting in their headphones.

And the web just makes sure that these kind of things are very visible. So you don't really need to accept this permission if you don't want to. They can obviously still program in a way that requires giving up your location history. But in the worst case, there's just no way to track background location on the web.

So you're not becoming an active target for whoever tracks this data. So yeah, just going web only for these kind of things makes a lot of sense. It's more secure. It's easier to roll out on many devices.

You don't have to have whatever. It's a lot of people who are going to watch this. So it's just way more accessible for a lot of people. And a lot of companies also realize that this is actually making a ton of sense.

We just got out of Google I/O season. So we had a bunch of events all around the planet. And what we showed at Google I/O Connect was LEGO.

And LEGO has this really, really cool set of, well, it's not a toy, but it somewhat is. So it's an educational toy, I guess. They just realized all these schools and universities, they have typically very low powered Chromebooks.

They have relatively simple Windows laptops. They have especially very mixed architecture of devices. And for them, it was one of the best solutions that was on the market to just say, hey, let's go web with our application. It's essentially something like a Scratch editor where you can move around blocks and then flash them onto the device so that you can then run your LEGO model, program the hub, and so on.

And yeah, we just showed that LEGO uses this for their production web applications that they ship with whatever LEGO kit you buy. Wow. Is that called the LEGO Mindstorms? Mindstorms was before that.

So the new generation is called Spike. OK. LEGO Spike Education. That's cool.

Speaking of companies getting on board, I think one of the bigger companies that people criticize for maybe not getting on board with APIs that could maybe reduce the influence of apps would be Apple. Do you see the Apple ecosystem taking these things seriously in your, and I guess maybe speak a little bit on what's the likelihood that all of this stuff comes to all the browsers? Yeah. Yeah, so that's the elephant in the room.

So WebUSB, WebBluetooth are some of the oldest APIs that we're talking about in this context. So they were launched on Chrome in 2015-ish. And yeah, to your question, how serious do they take them? I would say relatively serious.

So there's W3C TPAC, a yearly event where all the browser vendors and industry and researchers and whoever come together. And we discuss hardware APIs regularly there. There's a bunch of browser vendor representatives from Mozilla, from WebKit, from Google, obviously. And we talk about these APIs.

And it's typically a bunch of engineers with needs that want to bring these needs forward on behalf of the users. And what we want is we want to get other browser vendors to see that these use cases exist and that the use cases need to be addressed more safely on the web rather than on native applications, as we said before. So we discuss these APIs regularly at events like TPAC. And all of these APIs have very, very different threat models.

So something Bluetooth is a different threat model from something WebUSB, from something serial, from something HID. So all of these discussions need to be very, very detailed, which ironically, TPAC is kind of the worst venue for it because you bring a bunch of people into a room. They discuss. Half of the people don't really understand the problem just because it's a very specific problem when it comes to the security of USB, whatever. So half of the people zone out.

And you have a bunch of people very engagedly discussing. And it becomes a relatively, I would say, heated-ish debate relatively quickly just because it's a topic that moves people and makes people take very strong opinions that they hold very weakly and sometimes very strongly. So yeah, I would say it is a very interesting discussion all the time. And of course, we do hope that eventually we can convince vendors like WebKit, like Mozilla, to eventually ship those APIs.

And the first success was had finally with WebMIDI. So, talking to MIDI devices over the web. And Mozilla was the first to ship this after Chrome. And what they did is they came up with a solution where they essentially create an on-the-fly extension whenever you want to talk to a WebMIDI device.

And this extension then acts as the barrier between you and the API level signal so that you can send to the MIDI device. So we were hoping maybe something like this could be a way forward to allow other vendors like WebKit maybe to consider that position and say, yeah, this might be something that we can allow for our browsers as well. So yeah, it's a very interesting discussion still ongoing. There's no final solution yet.

But yeah, that's the elephant in the room question. Yeah, I'm glad you brought up WebMIDI because that was one that I think got a lot of press for Google implemented this. And Safari and Firefox are both choosing not to implement this. But then you said Firefox did come up with a solution there.

So do you think that Safari will reverse their position now that there's a change there? Or do you think there needs to be more, I don't know, the reason was, what, security or fingerprinting was the reason why Safari declined to add the MIDI support. So what do you think is going to happen there? Do you think that will eventually get reversed?

So what we saw works mostly with Apple. And of course, this is only speculating. It's very much a black box. But when a big company asks for something and they are in the position of saying to their users, hey, we are big company X.

You can use our product in Chrome. And if you use our product in Safari, please switch browsers. This is of course something that no vendor wants. So they try to make the use case happy and possible on the device in question, which means sometimes going through extra hoops like Mozilla did.

Will this ever happen? We don't know. So a big company like Lego asking for this is obviously something different. But then also they of course do have a very strong presence on the iOS app store.

So you could say people who want to talk to Lego devices and who have in their schools, let's say a fleet of iPads, they could just download the iPad app for Lego and talk to the devices like this. So it's interesting. Yeah, definitely. Yeah.

So we talked to Eric Meyer who works at Igalia. Igalia is hired by all the browser vendors to implement specific features. And I thought the most interesting one was that they were hired by Adblock to implement CSS :has() into the browser because that would be super helpful for blocking elements that have ads. Is a lot of this stuff also driven by specific companies?

Like you said, Lego. But like, do you have any other examples of companies that say like, you know what? We need this API because not necessarily because someone wants to visit the URL in the browser and use this thing. But our app runs on Chromium and we need this API in order for it to work.

Yeah. So one of the big cases that we had was Adobe. And Adobe brought Photoshop to the web, which is kind of mind blowing in itself. The way Photoshop works is essentially they have a big, big swap file on their native application.

And in the swap file, they host things like mipmaps. So I didn't know what a mipmap is. So I guess most listeners probably would be happy for me to explain, at least try to explain what it is. So essentially, if you have images and you have images of different sizes so that you can zoom in, a mipmap is a way of just pre-calculating various resolutions of a certain zoomed image so that you can then very quickly change zoom levels.

So mipmaps serve this and probably other purposes. And I've probably done the worst job of explaining it. But essentially, the requirement there is you need to talk very performantly to a file and frequently. So because whenever you make even one pixel change, all your zoom levels need to reflect this change because you want to zoom in and you want to see the change reflected everywhere.
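To make the mipmap idea concrete, here is a small sketch that precomputes the chain of half-size levels for a grayscale image and then shows how a single pixel edit touches one pixel on every level. The names `GrayImage`, `blockAverage`, `downsample`, `buildMipChain`, and `setPixel` are illustrative assumptions, the 2x2 averaging filter is just one simple choice, and the code assumes power-of-two dimensions; it does not describe how Photoshop stores its data.

```typescript
// A grayscale image stored row by row, one number per pixel.
interface GrayImage {
  width: number;
  height: number;
  pixels: number[];
}

// Average of the 2x2 block in `img` that maps to pixel (x, y) one level up.
function blockAverage(img: GrayImage, x: number, y: number): number {
  const at = (px: number, py: number): number => img.pixels[py * img.width + px] ?? 0;
  return (at(2 * x, 2 * y) + at(2 * x + 1, 2 * y) + at(2 * x, 2 * y + 1) + at(2 * x + 1, 2 * y + 1)) / 4;
}

// Halve an image in both directions. Assumes even width and height.
function downsample(img: GrayImage): GrayImage {
  const width = img.width / 2;
  const height = img.height / 2;
  const pixels: number[] = [];
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      pixels.push(blockAverage(img, x, y));
    }
  }
  return { width, height, pixels };
}

// Precompute every zoom level, from full size down to the smallest one.
function buildMipChain(base: GrayImage): GrayImage[] {
  const levels: GrayImage[] = [base];
  let current = base;
  while (current.width > 1 && current.height > 1) {
    current = downsample(current);
    levels.push(current);
  }
  return levels;
}

// Change one base pixel and refresh the single affected pixel on each
// coarser level, which is why every edit writes to the whole chain.
function setPixel(levels: GrayImage[], x: number, y: number, value: number): void {
  let finer: GrayImage | undefined;
  for (const level of levels) {
    if (finer === undefined) {
      level.pixels[y * level.width + x] = value; // full-resolution level
    } else {
      x = Math.floor(x / 2);
      y = Math.floor(y / 2);
      level.pixels[y * level.width + x] = blockAverage(finer, x, y);
    }
    finer = level;
  }
}
```

Each call to `setPixel` performs one small write per level, which is the access pattern (many small, frequent writes to a large scratch file) described above.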

So what they needed on the web was a way to sort of replicate this behavior so that you could create a Photoshop image or open a Photoshop image, make some changes, and then be able to zoom in very quickly. And this eventually led to the creation of what we now have as a cross browser API called Origin Private File System. It was born under the name Storage Foundation.

So people who have been following our blogs have maybe first read the Storage Foundation blog post. And then eventually, they read the Origin Private File System blog post. And yeah, so this is where we could see this kind of behavior really driving cross browser adoption because it's obviously a very, very big company on the web. And yeah, they needed the API.

They wanted to bring Photoshop to the web, and not just to Chrome, which is something that I always applaud. So if you build applications, you should be building web applications, so not applications that only run in Chrome. And that's me working on the Chrome DevRel team. So we actually want to get web applications, not only-running-in-Chrome applications.

So yeah, we saw that Adobe probably pulled some strings behind the back of most curtains that people see. And they just somehow made Apple implement the Origin Private File System, OPFS. And eventually, this also led Mozilla to implement. And now we're in the situation where we have this new API.

It's a WHATWG standard, the File System Standard, that implements essentially a very performant file system that is modeled mostly after POSIX APIs so that you can get synchronous access to write and read operations, which makes them very performant. It's hidden from the origin, or actually, hidden in the origin of the application that you're loading, which means the files that you create there are not exposed to the user-visible file system. And this means browsers don't have to implement things like Safe Browsing. So whenever you download a file from an application, your browser typically does something like virus checks and makes sure that you don't get a Trojan horse.

So this is called Safe Browsing on Chrome. Other browser vendors have other options. But yeah, mostly the idea is the same. We want to make sure that you don't download anything bad onto your device.

If these files are not discoverable by the user by just going somewhere on the File Explorer and seeing them, we don't need to run these checks. Because we can just be sure the application deals with these files alone. There's no way that anything else can get access to those files. And this is why the access to these files can be very performant.
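As a rough illustration of the synchronous read and write access described here, the sketch below patches a few bytes in place through a handle shaped like `FileSystemSyncAccessHandle`. The interface `SyncHandleLike`, the helpers `patchBytes` and `readBytes`, the class `MemoryHandle`, and the file name `scratch.bin` are assumptions made for the example; `MemoryHandle` is only an in-memory stand-in so the functions can be tried outside a browser.

```typescript
// The subset of FileSystemSyncAccessHandle used here, declared locally
// (an assumption) so the sketch compiles without worker DOM typings.
interface SyncHandleLike {
  getSize(): number;
  read(buffer: Uint8Array, options?: { at?: number }): number;
  write(buffer: Uint8Array, options?: { at?: number }): number;
  flush(): void;
  close(): void;
}

// Overwrite a few bytes in place, e.g. one changed region of a scratch file.
function patchBytes(handle: SyncHandleLike, offset: number, bytes: Uint8Array): void {
  handle.write(bytes, { at: offset });
  handle.flush(); // ask for the change to be persisted
}

// Read up to `length` bytes starting at `offset`.
function readBytes(handle: SyncHandleLike, offset: number, length: number): Uint8Array {
  const out = new Uint8Array(length);
  const bytesRead = handle.read(out, { at: offset });
  return out.subarray(0, bytesRead);
}

// In-memory stand-in with the same shape, for trying the helpers elsewhere.
class MemoryHandle implements SyncHandleLike {
  private data: Uint8Array = new Uint8Array(0);
  getSize(): number {
    return this.data.length;
  }
  read(buffer: Uint8Array, options?: { at?: number }): number {
    const at = options?.at ?? 0;
    const chunk = this.data.subarray(at, at + buffer.length);
    buffer.set(chunk);
    return chunk.length;
  }
  write(buffer: Uint8Array, options?: { at?: number }): number {
    const at = options?.at ?? 0;
    const end = at + buffer.length;
    if (end > this.data.length) {
      const grown = new Uint8Array(end);
      grown.set(this.data);
      this.data = grown;
    }
    this.data.set(buffer, at);
    return buffer.length;
  }
  flush(): void {}
  close(): void {}
}

// In a dedicated worker of a supporting browser, a real handle comes from:
//   const root = await navigator.storage.getDirectory();
//   const file = await root.getFileHandle("scratch.bin", { create: true });
//   const handle = await file.createSyncAccessHandle();
//   patchBytes(handle, 0, new Uint8Array([1, 2, 3, 4]));
//   handle.close();
```

Only obtaining the handle is asynchronous; the reads and writes themselves return immediately, which is what makes the frequent small updates cheap.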

Let's go through a couple more of these APIs. Because there's literally hundreds of them, ranging from a crazy idea someone thought of one day to actual ones that are going through the entire standards process. So one kind of in the middle that I've used a couple of times in my courses is the face detection API. And I don't think that's a browser standard or anything like that.

It only works in Chrome. But it's kind of cool to see that type of thing as being sampled. So do you have any other ones that are kind of interesting or some of your favorites? So face detection actually is a really interesting example that I could maybe briefly touch upon.

So face detection forms part of an API called the Shape Detection API. The one part that is exposed, not behind a flag, just shipped, is the barcode detection part. And we actually see Apple is implementing that as well, because it's considered to be a generally useful API. So face detection is also in this family of shape detection.

Obviously, if you look at devices like iPhones and Android phones and so on, they tend to have face detection as part of the operating system. So the camera is able to spot where is the face so that it can zoom on you and stuff. And so today I listened to your show on AI, which obviously, AI face detection, face landmark detection, and so on is a very interesting computer vision task. So with many of these APIs, the question arises, wait, does this make sense as a high level API?

So should there be whatever detect face something, something API? And it does exactly that and nothing else? Or should there be a generic way of saying, hey, we give you the freedom to load whatever model in the browser and then run whatever camera image through it, and you get the detected faces and landmarks and whatever. Or should this somehow be tied into what the operating system gives, which then shows another interesting problem, which is not all operating systems have the same level of support.

So if you think a Windows laptop, maybe face detection is part of the operating system, but maybe not for Android, it is. So where do you start? Where do you stop? It's a very interesting question to ask with many of these APIs.

So how high level should it be? How generally useful should it be? Or should you just implement something that is super low level? And if you want to get to a higher level, you need to whatever pull in some model that does it for you from Hugging Face or wherever.

So yeah, it's an endless discussion. And I guess we don't really have an answer yet. So more recently, continuing to new interesting API proposals, there was a proposal from Intel for a blurring API. So looking at your image, Scott, you have your background blurred.

I think you've got two. So this is part of the operating system in many cases. But of course, it's also something that you can do with segmentation. So you can just use a model to segment the foreground from the background and then apply some blur effect to it.

So yeah, Intel have proposed this as an API that is high level, but it can obviously be done as a low level API as well. So very interesting discussion. Yeah, I went down this exact route yesterday. I was trying to run a bunch of models in the browser, kind of seeing which ones are small enough and fast enough that you can actually run it in the browser at no cost.

It's just the CPU. And there's a really good library called transformers.js, which is being worked on by a dev at Hugging Face. And he has adapted many of the popular models to being able to run in the browser. I was really surprised, like, everything from image detection, elephant detection to.

Hot dog detection. Hot dog detection. And I did that one of my types of course, the hot dog detection, but also things like sentiment analysis and creating embeddings directly in the browser. And I went into the issues to see what people are asking for.

And it's not, oh, we need a browser-based AI like I can load a model into the browser and use it. Because I don't know if in three years we'll be like, well, the browser one sucks, you know? It doesn't work. What people are asking for is even a lower level one: we just want WebGPU support for this library so that it'll run faster.

So it's kind of interesting. Like you said, where do you draw the line of what API should we have? Or should we just provide lower level primitives, like WebGPU support that makes it fast as hell? This is sort of blurring the lines with the AI discussion.

So we have the WebNN proposal by Intel, which obviously makes sense if you're a processor maker, that you want to get things onto the processor. But yeah, we also obviously have things like TensorFlow, TensorFlow.js, TensorFlow running in Wasm. We do have, as you said, the higher level APIs. So something that I find very interesting is connecting all the dots, making very cool applications.

By combining all these different APIs. And if you want to swap the implementation, you could easily do so. And something that I'm very interested in is also, if you think globally, downloading a model can be free for you if you are on high speed internet. I mean 50 megs, 100 megs, 500 megs, whatever.

It's essentially just done in a second. Or it can be very, very expensive. But download cost is one thing. Storage is the next. Like, how many different, let's say, video or image editing applications do you use? Let's say three.

And all of those would need to download some sort of model that you need to store somewhere. And ideally, store it in such a way that it's persisted, which brings us back to the origin private file system, where we could store such a model. But then, yeah, because recently, we had a lot more privacy focus on the web. None of these can be cached and stored in a way that they could be reused by different origins.

You have to, in the worst case, download the same model for every single application that you're using. So that's another interesting problem there. So how do you make sure that your applications are useful and accessible across the globe with all these different constraints that you have on network, on device power, on device memory? How do you make sure that this works?

Another option is, of course, you just say, hey, just bring your own API key. You plug your API key into the settings, and then you can use something that runs somewhere in a cloud. So you can give the same experience across the globe to everyone. It will be consistent because the cloud will determine how fast or not your model can run, and not the device.

You can think hybrid, so you can see how performant the device is. We have an API for compute pressure. You could dynamically even see, oh, how busy is the processor? Is it running in the red zone?

Is it still green? Maybe if I notice I'm running suddenly in the red zone, I could then dynamically swap and say I go from on device to in the cloud. So there's a whole bunch of interesting ways to combine all these new APIs and just make applications at Berkeley, especially for everyone. Wherever they may be and whatever device they might be using.
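The hybrid idea (run on device while the processor is in the green, fall back to the cloud when it goes into the red) can be sketched as a small decision function. The four pressure states are the ones the Compute Pressure API reports; the type `Backend`, the function `chooseBackend`, and the specific switching thresholds are illustrative assumptions rather than recommendations.

```typescript
// Pressure states reported by the Compute Pressure API.
type PressureState = "nominal" | "fair" | "serious" | "critical";
type Backend = "on-device" | "cloud";

// Decide where to run inference. Moving to the cloud happens at "serious"
// or worse; moving back waits for "nominal" so the choice does not
// flip-flop around a single threshold.
function chooseBackend(state: PressureState, current: Backend): Backend {
  if (state === "serious" || state === "critical") return "cloud";
  if (state === "nominal") return "on-device";
  return current; // "fair": keep whatever is running now
}

// Browser wiring, where PressureObserver is available:
//   let backend: Backend = "on-device";
//   const observer = new PressureObserver((records) => {
//     const latest = records[records.length - 1];
//     backend = chooseBackend(latest.state, backend);
//   });
//   await observer.observe("cpu");
```

Keeping the decision in a pure function separates the policy from the observer wiring, so the policy can be adjusted and checked without a browser.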

I, CapCut, is a desktop video editor that's built in. I downloaded the thing. I was like, this is amazing. Like it's a full blown video editor.

We've also had the dev on from Descript as well. That's kind of a similar idea. And I dove deep into the bowels of CapCut and you go into the Library, Application Support. And they have a lot of their models downloaded into your file system because it needs to run locally rather than upload the entire thing, process it, and bring it back.

Doesn't make sense for a lot of video work. That would be too slow. So I like seeing that. That's cool.

Wait, are you saying that they're in code? They have to download them. Where are you saying they did? Well, CapCut will download the models to your computer and run them locally rather than the opposite, which is uploading your video.

So when you want to use a new filter or something like that, you have to download the filter first and it takes a second to download it. I was watching where in my file system those were going. Yeah, Topaz functions the same way, although I don't think that's web-based. Here's a kind of a backwards question here.

On the Fugu tracker site, you'll see a lot of different differentiators for all these different platforms that these things may exist or work on. And PWA, progressive web apps, something we've seen everywhere. However, I'm also seeing IWA. You want to talk a little bit about what an IWA is?

I have a feeling like many people in our audience don't know what an IWA is and how it differentiates. Yeah, so IWA stands for isolated web app. So PWA is progressive web app, IWA isolated web app. Essentially, this is an idea where we looked at the use cases that some people have.

And if you think some of the use cases on the web are streaming other operating systems. So a lot of big companies like call centers, they run massive fleets of Chromebooks because they're cheap. And onto these Chromebooks, they stream Windows. So to the operators on the call center computers, it seems like for them, they're using Windows.

And of course, they want to make this as immersive as possible. So a Windows window running inside of a Chrome OS window, of course, is kind of confusing to many people. And in the worst case, you end up closing the wrong window, when you wanted to close the inner window and close the outer window and so on. So there's a number of APIs that only make sense for these kind of use cases.

And essentially, the use case there would be, hey, we are this streaming company and we stream operating systems. And we want Chrome windows to be borderless so that we can show our streamed windows, windows, windows, windows in a sense of Windows 11, for example. And we want to stream those and make them appear as if they were the actual windows that the user is interacting with. And it so happens that a lot of these companies also have native legacy applications sometimes, a legacy in the sense of they wish to rather shut them down, but they can't because they still need to support whatever platform that doesn't support everything they need quite yet.

So they solve this problem of streaming the operating system using some sort of socket. On the web, we, of course, have WebSockets. And we now have also WebTransport, which is kind of a more modern version of this kind of streaming socket-ish protocols. And so with the use case of streaming an operating system onto another operating system that runs a browser, we realized we would in the end need direct socket support.

So where the operator, so the company that streams the other operating system, wants their streaming protocol to be socketed somehow into your browser. The thing is, this eventually would allow people to circumvent any and all firewalls that companies have set up because you could just craft any kind of message onto this direct socket protocol and tunnel through the company firewalls, which is something that is a big, big red flag. So in the end, we realized there's just a number of use cases that we can't realize, even if we wanted to, on the open web. So progressive web apps, where you just go to a URL and then you're happy and run it, this just wouldn't fly.

So IWA is a proposal where we say we see the use case, but we can't enable this use case on the open web. But we definitely want to enable the use case to run in the browser. So it tends to be that this use case is something that a lot of people have that are on managed computers. So I said call center before.

So you have this massive call center with whatever thousands of Chromebooks that are all managed centrally. So the admin there could say, I want to push an application onto all the Chromebooks in the fleet. And all these apps should have, let's say, the right to run the direct socket API. Like that, it's a centrally controlled thing that an admin can turn off.

They have a kill switch in the worst case. And of course, they trust the application. It's not just something that they found somewhere on the internet, but something that they got maybe from the vendor. So this is one way of enabling those.

Or we say on Chromebooks, typically you have something like the Play Store, you have some sort of store where you can get applications from. So this is the definition of having a bundled application that is signed. So everything that is in the bundle is signed. It has a permissions manifest.

So it can only make a number of requests to certain allowlisted origins, for example. So this is another way of enabling this. And essentially the idea there is to have IWAs that people can install if they want to. But it needs to be a very conscious decision.

It needs to be a conscious, like almost traditional app install. So if you look at the UI that people have created on the design teams at Google for this, they even create a fake install bar. So that you see, oh, there's progress. Actually, something is happening now.

I'm installing this application. But essentially, it's just built with web technologies. Well, the direct sockets API. So probably six years ago, I did a project where I controlled a DJI drone via the browser.

And that drone was connected via UDP, over Wi-Fi. Sorry, it was connected over Wi-Fi. The signals were sent over UDP. And that's one of the, there's TCP and UDP.

That's part of the Direct Sockets API. And I wasn't able, you can't send UDP directly from the browser. So I had to send my signal over a WebSocket to a Node server. And Node can do lower-level stuff like that.

So it'd be kind of cool. But I'm just looking at the list now. And there is a proposal for joining a Wi-Fi access point from the browser. That would be really nice for a lot of these smart gadgets where you have to join the Wi-Fi network when you're setting up a smart plug or something.

That would be super handy to be able to do via the browser rather than having to download an app for every single thing that you buy. Oh, yeah, absolutely. The WebTransport API. I'm just looking at that now.

What's the use case for the WebTransport API? I see that it's for directly communicating with the HTTP server both ways. You caught me off guard there. So it's essentially a better WebSocket API.

But I don't know the details right now off the top of my head. But the way I was told at work is, yeah, it allows you to just send bidirectional messages in a UDP style. So better than WebSockets do at the moment. You have more degrees of freedom.
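To make the "UDP style" remark concrete, here is a minimal sketch of sending one unreliable datagram over a WebTransport-like connection. It assumes the `ready` promise and the `datagrams.writable` stream shape as exposed in Chromium around the time of this episode; the `TransportLike` and `DatagramWriterLike` interfaces and the `sendDatagram` helper are hypothetical names introduced only so the example does not depend on browser typings.

```typescript
// Minimal structural types mirroring the parts of WebTransport used below.
// These interface names are illustrative, not part of any browser API.
interface DatagramWriterLike {
  write(chunk: Uint8Array): Promise<void>;
  releaseLock(): void;
}
interface TransportLike {
  ready: Promise<unknown>;
  datagrams: { writable: { getWriter(): DatagramWriterLike } };
}

// Send one datagram. Like UDP, delivery and ordering are not guaranteed.
async function sendDatagram(transport: TransportLike, payload: Uint8Array): Promise<void> {
  await transport.ready; // wait until the connection is established
  const writer = transport.datagrams.writable.getWriter();
  try {
    await writer.write(payload);
  } finally {
    writer.releaseLock(); // let other code obtain a writer later
  }
}

// In a browser this might be used as follows (the URL is a placeholder):
//   const transport = new WebTransport("https://example.com:4433/telemetry");
//   await sendDatagram(transport, new Uint8Array([1, 2, 3]));
```

Datagrams suit data where a late packet is worthless, such as controller input or drone telemetry. WebTransport also offers reliable streams for data that must arrive.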

But yeah, I'm not really into the intricate details there. So what I do know is that this is one of the APIs that the other vendors are interested in. This is one of the APIs that Mozilla, for Firefox, and Apple have implemented. I think they haven't shipped it yet.

But it's something that they are on board with. It means the specification has been fully adopted by the W3C. It is on a true standards track in this particular case. Oh, it seems like you're able to do pooling, congestion, and unreliable connections.

So certainly if you have a spotty device or spotty network connection, that's probably a major pain with WebSockets right now, where those things are just lost. So this will allow you to handle that better. This is cool. I had never heard about this before.

Yeah, it's one of the nice APIs. But you need to have the use case, of course, for many of those. Some of them are really niche. Some of them are useful only for a handful of use cases.

But if you have one of these use cases and you want to enable it on the web, of course, it's crucial to have this particular API on the browser. Or in the browser. Yeah, I definitely implore anybody out there who is interested in this conversation at all to check out this Fugu API Tracker. Because it'll get your wheels turning.

I have a, we briefly mentioned the Web MIDI API. But I have a MIDI device arriving at my doorstep in about an hour or two here. Really? No, I'll tell you what.

Well, I have a lot of MIDI devices around the house, but... Can you explain to someone who's not into music what MIDI is? I know it's like a, it's data, right? Yeah.

Yeah. MIDI is the Musical Instrument Digital Interface. And basically, you know, you could think of sound and audio as being a waveform, right? Yeah.

MIDI is not that. MIDI is information. It's saying this note at this time with this pitch, with this level of intensity, how hard you're hitting the button or things like that. Either way, the MIDI interface for working in MIDI, it tracks data.

How long you're pushing down a key, what that key is, and again, the intensity of that key. So when you're writing music in MIDI, Wes, you'll often see it as dots or lines on a piano roll. OK. But those are essentially, you could think of it as you have your keyboard and you have all these keys and you could have key up events on any one of these keys or key down events on any one of these keys.
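The "key down" and "key up" events described here map onto MIDI note-on and note-off messages. The sketch below decodes a three-byte MIDI channel message into the note, channel, and velocity (how hard the key was hit). The type `MidiNoteEvent` and the function `parseNoteMessage` are hypothetical names for illustration; only the byte layout comes from the MIDI standard.

```typescript
// Illustrative types and names; not part of the Web MIDI API itself.
type MidiNoteEvent = {
  kind: "noteOn" | "noteOff";
  channel: number;  // 0-15
  note: number;     // 0-127, where 60 is middle C
  velocity: number; // 0-127, the intensity of the key press
};

// Decode a 3-byte MIDI channel message into a note event.
// Returns null for anything that is not a note-on or note-off message.
function parseNoteMessage(data: Uint8Array): MidiNoteEvent | null {
  if (data.length < 3) return null;
  const status = data[0] & 0xf0;  // high nibble: message type
  const channel = data[0] & 0x0f; // low nibble: channel number
  const note = data[1] & 0x7f;
  const velocity = data[2] & 0x7f;
  if (status === 0x90 && velocity > 0) {
    return { kind: "noteOn", channel, note, velocity };
  }
  // A note-on with velocity 0 is conventionally treated as a note-off.
  if (status === 0x80 || status === 0x90) {
    return { kind: "noteOff", channel, note, velocity };
  }
  return null;
}
```

In a browser, the bytes would come from the Web MIDI API: `navigator.requestMIDIAccess()` resolves to an object whose `inputs` each fire `midimessage` events, and `event.data` is the `Uint8Array` passed to a parser like this one. How long a key was held is the time between its note-on and note-off.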

Now you translate that to, like, a drum pad or piano key or anything like that. And it's just a different interface for being able to register things like pressure sensitivity as well as timing and length of being held and things like that. So I've seen a lot of really interesting examples. MIDI is primarily used for music, right?

Yeah. But a lot of people are doing interesting things with interfaces, motion graphics. You know, Courtney and I were just watching the big Taylor Swift concert, right? Yeah.

That's on Apple right now. And there's all this crazy intricate lighting setup where things and events are happening. But it's happening very much as if somebody is playing an instrument. And that's how that stuff is typically done besides being programmed for time.

If there's things that need to happen being live triggered, somebody's physically playing essentially a trigger like a keyboard. That could be done with the Web MIDI API. That's correct. Because BetterTouchTool, that's the tool I use for my keyboard shortcuts and window management.

They of course support keyboards, but they also have support for MIDI. Meaning I can get a drum kit and do [imitates a drum fill].

And when I do that, it would resize my windows. Yeah, you just did "In the Air Tonight." Did you just do that on purpose? Yeah.

And now my windows are in place. Yeah. That's amazing. Hey, another cool thing you could do with it overall is like, hey, just play music and record music in the browser.

Because if you think about it, every single digital audio workstation, so you have like Ableton or whatever, every single powerful one of these tools has really great MIDI tools, including having MIDI support to interact with your keyboards. So that opens up a class of app. It's an input. I thought like, oh, like it does do it.

Like you can make a ringtone or something in the browser, but it's a very powerful input device, which can then control anything, likely music. Correct. Yeah. If you have the data, you can do anything you want with it.

Yeah. So Web MIDI is super cool. The Web Audio API is super amazing. Not an expert there either.

But what I've seen people do is they recreated all these super expensive 80s synthesizers in digital. And you can just play with whatever super expensive Moog something-something for free. But otherwise, it would cost you whatever, thousands of euros or dollars to own or even touch. And you can just do that.

And I mostly have no idea what I'm doing when I touch a synthesizer, but it's super fascinating what you can do there. So another quick story about hacks. So I learned that the Joy-Con controllers of the Nintendo Switch can be connected via Bluetooth. And they expose their signal as a HID signal.

So human interface device. And luckily, there is a WebHID API. So essentially, I wrote this driver that allows you to talk to Joy-Con controllers. So you could do things like just play games, obviously, with them, because the Gamepad API allows you to do some things, but it doesn't have access to rumble signals, for example, directly at the low level that the Joy-Cons have.

You have, of course, the gyroscopes. You can have orientation things like you can imagine playing like a maze game. And as you're tilting your Joy-Con, exactly, you would play the game and roll the ball accordingly. And I released this, put it on my GitHub.

And someone picked it up and it turns out there's a whole bunch of accessories that you can buy for the Switch. And one of these is like a fitness ring, something. So you would punch this ring or whatever. And I had a PR where someone just added support for this.

And then it turns out in Japan, they have a different version of the Switch. So someone just sent a PR in for the Japanese version of this. And one day, I almost had forgotten about the project, someone pings me out of the blue and it turns out this person is an artist. And this brings me back to what you said about music.

And they just use the Joy-Con controllers to make music, essentially. So they were dancing, they were doing a dance performance on stage holding two Joy-Cons in their hands. And the music would just adapt to whatever movements they were making. And all of this just powered by the web.

My mind was blown when I saw that. It's like the power of open source, right? So you release something stupid. Someone else builds upon it, makes it better.

You don't even have the hardware, but you just trust them because it's open source and they look friendly. And it's like a very, very nice community, really. And yeah, sure, I granted them access to the repository, made them admin to the repo. And they just suddenly made the thing better that I released for essentially a demo article.

And now probably someone somewhere is using this. And I don't even know what they're using it for. But I think this is what's really, really cool about Project Fugu: that we really allow anyone to just take these APIs and build amazing things with them, useful, useless, fun, serious.

So I'm really mind blown about the opportunities. Yeah, it really, so I went to school for a lot of this stuff. And some of my classes included people making a table tennis game that would play music when you played table tennis with MIDI triggers. And I did a projection-based synthesizer.

So if you stepped into various beams of light, it would play sound. And I've had all these interesting ideas around artistic creation. But since moving into web programming specifically, it's like all that stuff goes out the window because the tooling straight up isn't there. But this conversation has really opened my brain into like, oh man, I have all these old projects.

For instance, like you said with the Joy-Cons, I had one that used a Wiimote and used the accelerometer data when you hit the button and whatever, it would deconstruct music. And so, man, you could do so much stuff. And if you just scroll down this Fugu API Tracker list, if you're looking for something to get inspired by. Man, there's a lot of stuff to get inspired by on here.

So maybe we should talk about some of the more serious things that people build. We've been talking about all the fun things. Yeah, yeah. So if you think any kind of editor experience, what do you need?

You want to open files? You want to save files? You want to just press Command-S and be sure that whatever you just did is saved on disk and not downloaded as yet another copy of the thing that you were working on? You start thinking in the mental model of this is an actual application because with Fugu APIs, you can now create an app that is a PWA that people can install.

And once it's installed, it can become a file handler. So you can say, I built an application that handles whatever, .bird files. Hopefully, this doesn't exist. So, .bird.

And you can become, in Windows, in macOS, whatever, in the file explorer, the default file handler with your PWA. So people could double click a .bird file. Your PWA would open.

The person would have straight editing access as usual to this file. You wouldn't even have to go through a download and save dialog. But just once you have opened the file through this mechanism, through the File Handling API, you have a file handle. So you can just allow the person to use a keystroke to save the changes that they've made.
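As a rough sketch of that save-on-keystroke flow: once the app holds a file handle, it can write the edited contents straight back to the same file. The interfaces `WritableLike` and `FileHandleLike` and the helper `saveToHandle` are hypothetical names that mirror only the parts of `FileSystemFileHandle` used here, so the example stays independent of browser typings.

```typescript
// Minimal structural types mirroring the parts of FileSystemFileHandle used below.
// These interface names are illustrative, not browser API names.
interface WritableLike {
  write(data: string): Promise<void>;
  close(): Promise<void>;
}
interface FileHandleLike {
  name: string;
  createWritable(): Promise<WritableLike>;
}

// Write the editor contents back to the file the user opened.
async function saveToHandle(handle: FileHandleLike, contents: string): Promise<void> {
  const writable = await handle.createWritable();
  await writable.write(contents);
  // The changes are only committed to disk once the stream is closed.
  await writable.close();
}

// In an installed PWA that declares file_handlers in its manifest, the handle
// could arrive through the launch queue, for example:
//   window.launchQueue.setConsumer(async (launchParams) => {
//     const [handle] = launchParams.files;
//     // keep `handle` around and call saveToHandle(handle, text) on Command-S
//   });
```

The same helper would work for a handle obtained from a file picker or from a dropped file, which is what makes the "drop, edit, save back" flow possible.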

We have the Drag and Drop API that has existed for a long time. But with the file system integration, you can also say I drag and drop a file onto something that is an editor. I make my changes. And it's saved straight back to the file that I just dropped.

So super powerful flows that people forget. And that's the best about it, really. So once you get into this, oh, I just work in a PWA. And I don't even realize that this is web.

I think this is where the power really, really shines. And I'm very happy that macOS Sonoma now enables people to install any kind of experience as a desktop web application on Mac. And I think now that people do that, they will start seeing some of the use cases that the Chrome team addressed a long time ago with some of these Fugu APIs. Because yeah, whatever, this morning I was texting with Jim Nielsen about Gmail.

And he has the Gmail application installed as a PWA on Mac. And he realized, oh wait, it would be so much nicer if only there were a way to have the Badging API work so that you could see how many unread emails you have in your Gmail application right on the dock. I looked at the thing, and I remembered, oh yeah, on macOS Sonoma, Apple made the choice to not enable extensions. So because Gmail doesn't support the Badging API quite yet, there's an extension, of course, where someone hacked it.

So you could patch Gmail with an extension so that it would run the Badging API commands whenever it sees a new unread email come in. Yeah, Jim essentially found out, oh, if I just do the same thing in Safari and I just paste the commands that the extension uses into the DevTools console, or Web Inspector as it's called in Safari, I can get the same workflow. But of course, yeah, this is Jim being Jim.

People don't want to get an extension, maybe. Or they just expect the application to work like that to begin with. So if Gmail ever becomes truly a PWA that makes use of some of the more advanced features, people would just expect, oh yeah, if I install Gmail, I will have a badge of unread emails that shows me I really better stop talking and go back to my email rather than whatever, hanging out on podcasts or something. So I think this is really, really the power of the web here.

And by having more competition in the sense of, yeah, other browser vendors supporting those, I think people would realize these use cases exist. And I want a Safari user to double click a .bird file and have the Safari web app open and not be forced to install Chrome just to get whatever, the bird PWA that runs .bird files. Oh, so I just, I didn't even realize that you can now, Safari has Add to Dock on macOS. Like they've had Add to Home Screen on iOS for a while.

But now on the Syntax website, we could throw a little badge on the icon when there's a new podcast to listen to. We could update the number of unlistened-to episodes. Or there's a lot you can do with that. That's pretty cool.

And the nice thing is you can make this work without even the app running. So the service worker would pick up a push notification, and then you could update the badge right from the service worker. So it's super cool what you can do these days. Oh, no.
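A minimal sketch of the badge update described here, using the Badging API methods `setAppBadge` and `clearAppBadge`. The `BadgeCapable` interface and the `updateUnreadBadge` helper are hypothetical names; the navigator-like object is passed in as a parameter so the same function could be called from a page (`navigator`) or from a service worker (`self.navigator`).

```typescript
// Structural type for the Badging API methods; the interface name is illustrative.
interface BadgeCapable {
  setAppBadge?: (count?: number) => Promise<void>;
  clearAppBadge?: () => Promise<void>;
}

// Show the unread count on the installed app's icon, or clear it at zero.
// Resolves to false when the Badging API is not available.
async function updateUnreadBadge(nav: BadgeCapable, unread: number): Promise<boolean> {
  if (!nav.setAppBadge || !nav.clearAppBadge) return false;
  if (unread > 0) {
    await nav.setAppBadge(unread);
  } else {
    await nav.clearAppBadge();
  }
  return true;
}

// In a service worker, this might be called from a push handler, assuming the
// push payload carries a hypothetical `unread` field:
//   self.addEventListener("push", (event) => {
//     const { unread } = event.data.json();
//     event.waitUntil(updateUnreadBadge(self.navigator, unread));
//   });
```

Returning a boolean instead of throwing lets the app keep working in browsers that do not expose the API.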

Yeah. Anything else we want to talk about before we move into the next section here? Oh, yeah, sure. So one API that I'm super excited about that is coming soon is the File System Observer API.

So the idea there is you grant access to a directory or a file on your hard disk, and you allow the PWA to observe changes. So why is this useful? Because you can imagine a photos application that has a cloud backup function to then just monitor your photos directory on your hard disk. So whenever you drop in a new photo, the PWA will just silently sync your photo directory to the cloud.

You could also imagine things like image optimization. So you have an output folder where you just dump your screenshots, for example, that you take. It will just automatically run them through a pipeline that optimizes your images, reduces the file size, adds a watermark or something. So File System Observer is really, really cool.
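The screenshot pipeline could start by picking newly added images out of a batch of change notifications. The API was still behind a flag at the time of this episode, so the record shape below (`type` and `relativePathComponents`) is an assumption based on the proposal. `ChangeRecordLike` and `newScreenshots` are hypothetical names for illustration.

```typescript
// Assumed shape of a change record, reduced to the two fields used here.
// The interface name is illustrative.
interface ChangeRecordLike {
  type: string;                     // e.g. "appeared", "modified", "disappeared"
  relativePathComponents: string[]; // path relative to the observed directory
}

// Return the relative paths of image files that newly appeared.
function newScreenshots(records: ChangeRecordLike[]): string[] {
  return records
    .filter((record) => record.type === "appeared")
    .map((record) => record.relativePathComponents.join("/"))
    .filter((path) => /\.(png|jpe?g)$/i.test(path));
}

// With the proposed API, wiring this up might look roughly like:
//   const observer = new FileSystemObserver((records) => {
//     for (const path of newScreenshots(records)) { /* optimize, watermark, ... */ }
//   });
//   await observer.observe(directoryHandle);
```

Filtering on the record type keeps the pipeline from reprocessing a file every time it is merely modified.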

It's coming soon. It's implemented behind a flag. If you want to test it, you can. But yeah, I can't wait for this to hit more people's actual browser so that they can see how useful this API can be.

Yeah, you could build a whole suite of tooling. So I wrote a little AppleScript so that whenever an HEIC file hits my downloads folder, it's automatically converted to a JPEG. Because if I'm sending it from my phone to my computer, obviously I need it in JPEG format. And I tried writing that in AppleScript, JavaScript.

Apple converted all of AppleScript to JavaScript. And there's nothing out there. It's impossible for me. I was like, I just want to npm install something and have this thing run.

And I could not figure it out for the life of me. So I went back to using the Automator drag-and-drop interface. But imagine you could just have a whole bunch of nice little scripts for that type of thing and watch it. Of course, you can do that with Node.js now, but you could install it via the browser.

Yeah, so this would be a perfect use case. You could just have this HEIC converter and it would just magically run whenever it detects a new whatever file. And convert it to something more readable. And yeah, you could just have this.

And when you switch computers, you could just reinstall it. It would run. It would even be cross-platform. So your AppleScript would run magically on Windows.

Because it's not AppleScript, it's just a web application that you can just install and be good. And this is going to be added to Chrome, meaning that a Mac user will be able to use this? So this is coming to Chromium, which means Chromium derivatives like Chrome, obviously, can enable this API or not. Obviously, on the Chrome team, we do enable it.

But other vendors, like for example, Brave, they might have a different stance on this API. It's coming to Chromium and it's coming to Chromium on desktop first, which means whenever you have an operating system that has some sort of change observer, it will be somehow translated to something that is web. An interesting case there is different operating systems have different levels of detail that they report. So some operating systems will just say, oh, there has been some whatever change in the directory.

So you'd have to then poll the directory and see what has even happened there. Has a new file been added, has one been deleted. Other operating systems will give you a more detailed change notification to tell you, oh, a new file was added to whatever example directory, which means on the web, we need to work with whatever is the lowest common denominator on all operating systems. But even so, it improves upon the status quo, because in the worst case, what you could do today to get something like this to happen, you could just poll the directory frequently and see if there is a change.
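The polling fallback mentioned here comes down to taking a snapshot of the directory on each poll and comparing it with the previous one. The sketch below shows only that comparison step, assuming a snapshot maps each file name to its last-modified timestamp. `Snapshot` and `diffSnapshots` are hypothetical names for illustration.

```typescript
// A snapshot maps file names to last-modified timestamps (illustrative type).
type Snapshot = Record<string, number>;

// Compare two snapshots and report what was added, removed, or modified.
function diffSnapshots(
  prev: Snapshot,
  next: Snapshot
): { added: string[]; removed: string[]; modified: string[] } {
  const has = (snapshot: Snapshot, name: string) =>
    Object.prototype.hasOwnProperty.call(snapshot, name);
  const added: string[] = [];
  const removed: string[] = [];
  const modified: string[] = [];
  for (const name of Object.keys(next)) {
    if (!has(prev, name)) added.push(name);
    else if (prev[name] !== next[name]) modified.push(name);
  }
  for (const name of Object.keys(prev)) {
    if (!has(next, name)) removed.push(name);
  }
  return { added, removed, modified };
}
```

Every poll has to enumerate the whole directory even when nothing changed, which is why this approach scales poorly for large projects compared with a change notification from the operating system.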

But of course, if you are vscode.dev, one of my favorite PWAs, this doesn't scale, right? Because in the worst case, you do a pull on the console and then you continue working with an outdated file in the IDE on the web, which is something that, of course, you want to avoid. I've in the past had really bad data loss because of that, because I had an editor that was native, way back in the day, that just didn't realize when something was changed out of band.

So on the web, of course, this is something that we want to avoid. And yeah, the File System Observer API will allow these kinds of workflows. And I'm pretty excited about this launching. Hopefully, really, really soon.

Wow. Yeah, that's amazing. Well, I'm excited about that one. And we just did a whole show on observables talking about some of the new observer APIs.

And it looks like this is the same observer pattern, and there's another one, the pressure one. It is exactly the same. Yeah, exactly. I think this is one of the oldest that uses this pattern.

Intersection Observer. Intersection Observer. Yeah. So it's just the File System Observer.

And it's nice. It's awesome. It's super nice. Wow.

Wow. Wow. Yeah, the browser. Growing up, I'm excited.

And I honestly think Apple's going to bend on a bunch of this. I think we're doing our prediction show after that. And that's one of my predictions is that we're going to have a couple of Apple's been kicking it lately. And I bet we're going to start seeing some PWA stuff land, not all of it, not how you expected it.

But I feel like that this is the year 2024. I knew I was so true. Let's move into the last section here, which is a sick pick and a shameless plug. You can compare it with either of those.

All right. So my sick pick is laser cutters. I've been working on a word clock recently. Well, recently, it was the 40th birthday present for my wife.

And she got it when she was close to 42. The reason was I had everything running, the electronics, the auto, you know, everything. The one thing that was missing was cutting out the letters. So I tried everything from cardboard to plastic to, you know, carpet knives and everything.

It just didn't work. I bought a scalpel like a medical scalpel. It just wasn't feasible. Yeah, that's a hard one.

At one point, I remembered, oh, at Google, we have maker spaces and in these maker spaces, there's laser cutters. And I just sent an email to the misc mailing list of the Munich office, asking: hey, I have this SVG and I need it laser cut, but I have no idea how a laser cutter works. And of course, someone picked it up immediately.

We got it working. I was mind blown when I saw the thing do its cutting in perfect accuracy. It's so satisfying to just watch your plywood become a plywood with letters. And yeah, so my sick pick is laser cutters and I looked them up on Amazon.

And you can get cheap ones for 200 euros-ish. So I wonder how good they are or not. So if anyone listening to the show has bought one of those, please let me know if they're any good or how much I should be spending. It's Christmas.

I don't really, really need one, but still it's a sick pick. So maybe I can just sick pick myself one. Who knows? A lot of maker folks listen to the show, people doing 3D printing and all kinds of stuff.

So I bet there's some opinions out there. What about shameless plugs? 3D printing, Web Serial is typically something that works for a 3D printer. So there's a bunch of 3D printers that use the Web Serial API.

Yeah, so shameless plugs. I built this website called howfuguismybrowser.dev. I think it hasn't gone viral enough. So the thing is, it tells you essentially with a percentage bar how Fugu your browser is.

So you get a percentage and you can compare, you can share it. And I think the best thing about it is it's like the Fugu API Tracker list of all the APIs, but you see immediately which of those APIs are supported on your browser. So that's howfuguismybrowser.dev. It has sharing integrated so you can share your results.
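A percentage like this can be computed by running one feature check per API and counting how many pass. The sketch below only illustrates that idea and makes no claim about how howfuguismybrowser.dev is implemented. `FeatureCheck` and `fuguScore` are hypothetical names, and the checks in the closing comment are common feature-detection idioms.

```typescript
// One named feature-detection check (illustrative type).
type FeatureCheck = { name: string; supported: () => boolean };

// Run every check and report the share of supported APIs as a whole percentage.
function fuguScore(checks: FeatureCheck[]): { percent: number; supported: string[] } {
  const supported = checks.filter((check) => check.supported()).map((check) => check.name);
  const percent =
    checks.length === 0 ? 0 : Math.round((supported.length / checks.length) * 100);
  return { percent, supported };
}

// In a browser, the checks might look like this:
//   const checks: FeatureCheck[] = [
//     { name: "Web MIDI", supported: () => "requestMIDIAccess" in navigator },
//     { name: "Badging", supported: () => "setAppBadge" in navigator },
//     { name: "WebHID", supported: () => "hid" in navigator },
//     { name: "Push", supported: () => "PushManager" in window },
//   ];
//   const { percent } = fuguScore(checks);
```

Because the checks run at call time, the same page can report a different score once it is installed, if the browser only exposes an API such as Push to installed web apps.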

It will create a screenshot automatically for you. It has hashtags. So if people want to search hashtags, they can do so on Twitter, which I don't use much anymore, but Mastodon too. So I think this should go more viral.

So this is my shameless plug. howfuguismybrowser.dev. Chrome Canary, 86% Fugu. Firefox Developer Edition, 24% Fugu.

Hold on, let me check Safari before I let you go. Last I checked, it was 26%. Do you think it's going to be more or less than Firefox? I would say it's going to be less.

Thomas, you guess. Do you think Safari is more or less fugu than Firefox developer? Oh, wait. So you're testing desktop, I guess.

Also desktop, oh, this brings me maybe to one more thing, in true Apple manner. So one of the ways that Apple has realized one of the use cases, which is the Push Notifications API: they only ever enable it if you install an application to the dock or to the home screen, which means if you run Safari, howfuguismybrowser.dev will tell you that the Push API is not supported, but if you install howfuguismybrowser.dev as a PWA or as a dock web app, whatever, or as a home screen web app, and you run it again, you will notice that now the API is supported. So last I checked on mobile running in the browser, it reached 26%. And I think installing it made it go to 27, but I might be wrong.

I'm also on the latest beta, which means I'm not super representative of what regular iOS users or iPad users have. But yeah, you tell me, what is your result running on? So stable Safari 17.2 is 27% Fugu, and then if I install to dock, it bumps it up to 28%. And I can't open Technology Preview because I haven't updated it and they probably want to read.

Awesome. Thank you so much for coming on. I appreciate all the insights and all the cool stuff. It's exciting.

I want to make a million things after this. So let us know what you, the audience, have built with this at Syntax FM. Yeah, thank you very much for having me. This was a lot of fun.

Thanks for the fun discussion. I'm, yeah, can't wait to see what your listeners will build with all these APIs. Awesome. Thank you so much, Thomas.

Great. Head on over to syntax.fm for a full archive of all of our shows. And don't forget to subscribe in your podcast player or drop a review if you like this show.

