Napster is being reframed here as a “streaming expertise” product: a library of domain AI companions you meet in real-time video instead of text chat. The demo focuses on embodied agents for tech support, fitness coaching, and personal guidance, plus digital twins that can mirror a real person and optionally escalate to a live call. The pitch is simple UX: talk naturally, keep context, and let the system handle the tool-wrangling under the hood. https://www.napster.ai/view
On desktop, the centerpiece is Napster View, a small clip-on display for Mac that uses a lenticular multi-view optical stack to create glasses-free stereoscopic depth, so an agent appears to float “above” your main screen and keeps eye contact. The team describes combining a custom lens with rendering tuned for multiple viewpoints to keep parallax consistent and reduce visual fatigue, with USB-C power and a low-cost hardware entry point. The footage was shot at CES 2026 in Las Vegas, where spatial UI for everyday computer work is turning into a practical form factor.
Software-wise, View is paired with a companion app that can see you and, when you grant permission, see what’s on your screen for situational awareness. That enables screen-guided help (for example, learning macOS app workflows quickly) and artifact generation such as emails, plans, or images based on what the model observes. They also preview “gated” control of macOS actions (launching apps, manipulating documents, editing media) with extra testing and safety checks, because that shifts the agent from giving advice to executing actions on your behalf.
The same conversational layer is used for generative media: you pick a genre and scenario, and an AI “artist” produces lyrics, cover art, and multiple song variants, then returns them through the UI as shareable assets. The transcript stresses a model-agnostic approach—swapping underlying LLM or music models as they improve—so users don’t need to track the fast-moving ecosystem. It’s a clear example of orchestration: multimodal input, structured outputs, and lightweight creative iteration in one place.
For public spaces, Napster Station extends the idea into a kiosk: camera-triggered interaction plus a near-field microphone array meant to isolate the voice of the person directly in front, even in loud environments. The pitch is “AI outside the browser,” where an embodied concierge can drive existing web surfaces (retail, airports, hotels, venues) by taking a spoken intent and executing steps like a digital employee. Technically it’s a blend of UX, audio DSP, vision, and agent workflows tuned for a crowded trade-show floor.
I’m publishing 100+ videos from CES 2026, uploading 4 videos per day at 5AM/11AM/5PM/11PM CET/EST. Check out all my CES 2026 videos in my playlist here: https://www.youtube.com/playlist?list=PL7xXqJFxvYvjaMwKMgLb6ja_yZuano19e
This video was filmed using the DJI Pocket 3 ($669 at https://amzn.to/4aMpKIC) with the dual wireless DJI Mic 2 microphones and the DJI lapel microphone (https://amzn.to/3XIj3l8). Watch all my DJI Pocket 3 videos here: https://www.youtube.com/playlist?list=PL7xXqJFxvYvhDlWIAxm_pR9dp7ArSkhKK
Click the “Super Thanks” button below the video to post a highlighted comment! Brands I film are welcome to support my work this way 😁
Check out my video with Daylight Computer about their revolutionary Sunlight Readable Transflective LCD Display for Healthy Learning: https://www.youtube.com/watch?v=U98RuxkFDYY
