Let’s build the future of 10 billion humans and bots in harmony – full of radical experience and social bonding – as Artificial General Intelligence (AGI) arrives in 2020s.
Let’s focus on the G. While AGI will be unevenly distributed the next few years, the Gen (generative) AI with large language model (LLM) is already here. We are not just the generative or general AI – but the Generation AI. To prioritize, follow the wisdom of market-product-team fit: generative agents, $1 fees, and twice daily. That is, are we riding the 1000x wave of the decade? Are users paying for what they ask and deserve? Do yourselves use the feature as often as you use a toothbrush?
Our first product is a Voice AI mobile app for real-time conversations on a particular domain. Our key features are 💬 interruptible responses for conversational turn-taking, 🔊 model instructions for chat-style feedback, and 🌊 screen-free focus with voice immersion. Better than Call Annie, “Hey Sam” supports ChatGPT4 (with 32K context), 100+ native languages, and emotion-varying voices.
- Problem: Use AI without screen time, keyboard input, or control menus.
- Solution: Speech recognition + ChatGPT4 + Speech synthesis as iPhone app.
- Technology: Speculative generation and on-device model for sub-second latency; interruptible conversations and emotion-varying tone for voice immersion.
- Customers: 30-min daily practice for language learners, or hour-long subject expert Q&A during commute.
Until 10K daily users – avoid adding any features: voice choices, chat transcripts, visual feedback, extension plugins, custom models, web fetches, user data, file uploads, multi modals, preference settings, keyboard input, on-device optimizations, Android native, or Web interface.
All user interactions are through voice for singular immersion – including switching between languages, adjusting speaking speed, changing personalities, using age-appropriate vocabulary, picking the right length of responses, connecting with different knowledge models, pruning context window length for cost, or selecting character tones.
No action buttons for start, pause, stop, or reset. Always listening and speaking on microphone while the app is active. Keep the chat context across multiple sessions until the app is reset with swipping up. Use iOS 17’s Action Button with the phone hardware side click for quick sessions. Free uses for 60 minutes. No setup of accounts or payments – Apple’s Pay will pop up instead as credits run out.
The product only has one full-screen display as dashboard – without any user control or interactions. The dashboard shows only the metrics of input voice decibel, system latencies, output voice volume, elapsed usage time, remaining payment credits, and a 64-bit session identifier with dash-separator. The overall system at center and 3 component latencies, measured as 5-second running averages, are for optimizing network and API performance of speech recognition, ChatGPT4 response, speech synthesis.
Learn more on x.country.