Napster Station 2.0: Transparent microLED AI Concierge Kiosk, VoiceField mic array, Azure AI

Posted by – February 12, 2026
Category: Exclusive videos

Napster Station 2.0 is a bundled “embodied AI concierge” kiosk concept that tries to make voice + video agents feel like a practical front desk for retail, hospitality, and public venues, not a demo that only works in a quiet lab. The idea is consistent brand behavior across channels: the same agent logic can live on a website, in-app, and in a physical station, with deployment framed as consumption-based “digital labor” rather than a big bespoke integration (https://www.napster.ai/).

What makes this build technically interesting is the hardware stack being treated as part of the AI product: a transparent microLED touch display (AUO) paired with a high-end embedded compute module, a 48MP-class camera, and a beamforming microphone array tuned for near-field capture. In the booth conversation you can hear the engineering focus on noisy environments: voice isolation, face/pose tracking, and lip/mouth movement cues to improve diarization and reduce pickup from bystanders, plus tighter tuning of gain staging, echo cancellation, and latency.
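For intuition, here is a minimal Python sketch of the delay-and-sum idea behind near-field beamforming of this kind. Everything in it (array geometry, sample rate, integer-sample delays) is an assumption for illustration; the actual VoiceField processing chain isn't disclosed in the video.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s
FS = 16_000             # sample rate in Hz (assumed)

def delay_and_sum(signals: np.ndarray, mic_x: np.ndarray, angle_deg: float) -> np.ndarray:
    """Steer a linear mic array toward angle_deg (0 = broadside).

    signals: (n_mics, n_samples) time-aligned capture from each mic
    mic_x:   (n_mics,) mic positions along the array axis, in meters
    """
    angle = np.deg2rad(angle_deg)
    # Far-field approximation: per-mic arrival delay relative to the array origin.
    delays_s = mic_x * np.sin(angle) / SPEED_OF_SOUND
    delays_smp = np.round(delays_s * FS).astype(int)  # integer shifts (sketch-level simplification)
    out = np.zeros(signals.shape[1])
    for sig, d in zip(signals, delays_smp):
        out += np.roll(sig, -d)  # advance each channel so the target direction lines up
    return out / len(signals)   # average: unity gain on-target, attenuation elsewhere

# Example: 4-mic array at 3 cm spacing, steered 20 degrees off broadside.
mics = np.arange(4) * 0.03
capture = np.random.randn(4, FS)  # placeholder for real multi-channel audio
focused = delay_and_sum(capture, mics, 20.0)
```

A production chain would use fractional delays, adaptive weights, and echo cancellation, and, per the booth conversation, fuse in the visual cues (face/pose and lip movement) to decide where to steer and whose speech to keep.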

On the display side, the station moves from transparent OLED to transparent microLED in a sub-millimeter pitch range (described as about 0.66 mm) to push higher luminance and better see-through characteristics for an “object-behind-the-screen” effect. It’s the kind of panel where optical bonding, cover glass, and PCAP multi-touch matter as much as pixel tech, because reflections, parallax, and touch accuracy define whether it feels like a usable interface or a showroom trick. The transparency also changes interaction design: you can keep eye contact through the screen while still using it as a UI canvas.
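For a sense of scale, here's the back-of-envelope math on that pitch; the pitch is the only number quoted, so the derived figures are just arithmetic, not specs.

```python
# What a ~0.66 mm pixel pitch means in practice.
pitch_mm = 0.66
ppi = 25.4 / pitch_mm           # ~38.5 pixels per inch
px_per_meter = 1000 / pitch_mm  # ~1515 pixels across one meter of panel
print(f"{ppi:.1f} PPI, {px_per_meter:.0f} px/m")
```

Roughly 38 PPI is coarse by handheld standards, which is why viewing distance, luminance, and the see-through effect carry the experience more than raw pixel density.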

The software story is equally “stacked”: Napster positions it as multi-cloud and partner-friendly, with Microsoft Azure AI Foundry mentioned for the real-time voice/video agent behavior, and the stack described as able to run across different hyperscalers and models (Gemini comes up in the discussion). In this demo, the agent “Kai” isn’t just Q&A; it can branch into multimodal content generation (e.g., creating a short, shareable song with lyrics + audio), then hand off via QR code for retrieval and sharing, which hints at a broader workflow engine behind the avatar.
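The QR handoff step is easy to picture in code. Here's a hedged sketch using the common Python qrcode package; the retrieval URL format is invented for illustration, since the real endpoint isn't shown in the video.

```python
import qrcode  # pip install qrcode[pil]

# The kiosk renders a QR code pointing at the generated asset so a visitor
# can pull it onto their phone. The URL below is hypothetical.
asset_url = "https://example.com/kai/session/abc123/song"
img = qrcode.make(asset_url)   # returns a PIL image of the code
img.save("handoff_qr.png")     # display this on the kiosk screen
```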

This video was filmed at ISE 2026 in Barcelona, where the station is shown in the faytech booth as a prototype on its way to lighthouse customers. The most convincing part is not the avatar animation but the systems thinking: sensor fusion (camera + mic), low-latency streaming (websocket-style), voice UX for public spaces, and a display architecture that lets the kiosk live in the middle of a room without visually blocking it.
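To make “websocket-style” low-latency streaming concrete, here's a rough full-duplex loop using the Python websockets package. The endpoint, audio framing, and event format are all assumptions; real-time agent protocols (Azure AI Foundry's included) define their own schemas.

```python
import asyncio
import websockets  # pip install websockets

async def run_session(uri: str, mic_frames):
    """Stream mic-array audio up while consuming agent events as they arrive,
    so playback never waits for a full utterance. Everything here is illustrative."""
    async with websockets.connect(uri) as ws:

        async def uplink():
            for frame in mic_frames:        # e.g. 20 ms PCM chunks from the array
                await ws.send(frame)

        async def downlink():
            async for event in ws:          # agent audio/text, streamed incrementally
                print("agent event:", event[:40])
            # downlink ends when the server closes the socket

        # Full duplex: send and receive concurrently to keep round-trips short.
        await asyncio.gather(uplink(), downlink())

# asyncio.run(run_session("wss://example.com/agent", frames))  # hypothetical endpoint
```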

I’m publishing 75+ videos from ISE 2026; check out all my ISE 2026 videos in my playlist here: https://www.youtube.com/playlist?list=PL7xXqJFxvYvjUiepj5jbL6aIt6QB9jeCk

This video was filmed using the DJI Pocket 3 ($669 at https://amzn.to/4aMpKIC) with the dual wireless DJI Mic 2 microphones and the DJI lapel microphone (https://amzn.to/3XIj3l8).

“Super Thanks” are welcome 😁

Check out my video with Daylight Computer about their revolutionary Sunlight Readable Transflective LCD Display for Healthy Learning: https://www.youtube.com/watch?v=U98RuxkFDYY

source https://www.youtube.com/watch?v=Fq8c-fZbpaI