The Universal AI App?
Cursor and similar IDEs are close to the ideal universal interface for AI interaction, at least for programmers, but their capabilities are currently limited primarily to text and image generation. By incorporating video, real-time audio, and mobile-friendly interfaces, we can evolve Cursor into a truly universal interface for AI interaction.
The Vision
All personal computing is built on top of a few core media types:
- Text processing and generation
- Video handling (both stored and real-time)
- Image manipulation and generation
- Real-time audio processing and voice input
- Mobile-first and wearable interaction patterns (glasses, watches)
This vision particularly impacts creative tools and workflows. Traditional design tools like Figma may face disruption as AI enables technical users to directly translate their ideas into designs. The distinction between technical and non-technical creators begins to blur, as AI-powered interfaces make sophisticated design and development capabilities more accessible to those with domain expertise.
Going (realtime) multiplayer
While current AI-powered IDEs like Cursor excel at single-user interactions (essentially a one-on-one dialogue between programmer and AI), many creative workflows demand richer collaboration. The Git-style collaboration model, though effective for code, doesn't translate well to more fluid creative processes like design critiques or brainstorming sessions.
The platform needs to support:
- Real-time Creative Collaboration:
  - Synchronous multi-user workspaces
  - Fluid feedback loops for creative processes
  - Context-aware collaboration modes for different disciplines
- Organic Data Generation:
  - User corrections and feedback become training signals
  - Natural workflow produces high-quality training data
  - Domain experts' interactions create specialized datasets
- Collaborative Knowledge Building:
  - Social motivations (community recognition, status)
  - Professional growth opportunities
  - Optional monetization for specialized expertise
- Quality Control Systems:
  - Peer review mechanisms
  - AI-assisted verification
  - Expert validation workflows
The core remains a dialogue between users and AI, but now operating in a truly multiplayer environment where artifacts (code, media, documents) can be collaboratively refined. Modern AI systems have established the foundational patterns—the next step is making them work at scale across teams.
Building a Self-Improving Ecosystem
The key to making this system truly powerful lies in creating a self-reinforcing, self-improving loop. This requires:
- A developer and power-user friendly toolset
- Deep system-level integration possibilities
- Incentive structures that reward continued engagement and improvement
- Model Specialization Management: As AI models become increasingly specialized, the platform must:
  - Handle varying model capabilities and limitations effectively
  - Provide seamless routing to appropriate models based on task requirements
  - Manage feature enablement/disablement across different models
- Cross-Model Standardization: Creating consistent interfaces across divergent model capabilities
- Voice-First Integration: Emphasizing voice input as a primary interaction method for broader consumer adoption
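The routing and feature-enablement points above can be sketched as a capability lookup. This is a minimal illustration, not a design: the model names, capability labels, relative costs, and the `route` and `disabled_features` helpers are all hypothetical and chosen only to show the idea.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    capabilities: frozenset[str]
    cost: float  # relative cost per request (illustrative numbers only)


# Hypothetical registry; real models and their capabilities will differ.
REGISTRY = [
    ModelProfile("general-chat", frozenset({"text", "images", "web"}), 1.0),
    ModelProfile("deep-reasoner", frozenset({"text", "reasoning"}), 5.0),
    ModelProfile("voice-assistant", frozenset({"text", "audio"}), 2.0),
]


def route(required: set[str], registry: list[ModelProfile] = REGISTRY) -> ModelProfile | None:
    """Return the cheapest model whose capabilities cover `required`, or None."""
    candidates = [m for m in registry if required <= m.capabilities]
    return min(candidates, key=lambda m: m.cost, default=None)


def disabled_features(model: ModelProfile, ui_features: set[str]) -> set[str]:
    """Features the interface should switch off while `model` is active."""
    return ui_features - model.capabilities


if __name__ == "__main__":
    chosen = route({"text", "reasoning"})
    print(chosen.name if chosen else "no model supports this task")
    print(disabled_features(REGISTRY[1], {"text", "web", "images"}))
```

Keeping capabilities as plain sets means the same table can drive both routing and the enable/disable state of interface features, which is one way to present a consistent surface over divergent models.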
Data Quality and Trust
To maintain high-quality interactions and trust in the system:
- Expert Contributions: Special workflows for domain experts to contribute specialized knowledge
- Feedback Loops: Systems for users to improve and correct AI outputs
- Knowledge Attribution: Clear tracking of knowledge sources and contributors
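As a rough sketch of how feedback loops and attribution could share one record, each user correction might carry its contributor and cited sources. The `Correction` fields and both helper functions below are assumptions made for illustration, not part of any existing system.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Correction:
    artifact_id: str   # the artifact (code, document, media) being corrected
    contributor: str   # who supplied the correction
    original: str      # the AI output before correction
    corrected: str     # the user's improved version
    sources: list[str] = field(default_factory=list)  # references backing the fix


def credit_by_contributor(corrections: list[Correction]) -> Counter:
    """Count corrections per contributor, e.g. for recognition or review weighting."""
    return Counter(c.contributor for c in corrections)


def provenance(corrections: list[Correction], artifact_id: str) -> list[str]:
    """Sorted, de-duplicated sources cited in corrections to one artifact."""
    cited = {s for c in corrections if c.artifact_id == artifact_id for s in c.sources}
    return sorted(cited)


if __name__ == "__main__":
    log = [
        Correction("doc-1", "ana", "teh plan", "the plan"),
        Correction("doc-1", "raj", "v1 spec", "v2 spec", ["spec-v2.pdf"]),
        Correction("doc-2", "ana", "old API", "new API", ["changelog.md"]),
    ]
    print(credit_by_contributor(log))
    print(provenance(log, "doc-1"))
```

The same log could later serve as a training signal, since each entry pairs an original output with its corrected form.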
The Social Layer
The platform’s social components will be crucial:
- A marketplace for tools and extensions
- Collaborative workspaces
- Multiplayer AI interaction capabilities
- Real-time collaboration features
Product Architecture
The ideal implementation would combine:
- Cursor-style interface mechanics
- Real-time collaboration capabilities
- Flexible multi-model integration
- Marketplace functionality for:
  - User-created agents and tools
  - Model deployment and feedback
  - Preference data collection and comparison across models
  - Specialized model deployment with clear capability definitions
- Developer-friendly APIs
- Cross-platform presence (mobile, wearables, desktop)
- Voice-first interaction layer for consumer applications
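One simple way to picture preference data collection across models is a pairwise vote tally. The sketch below assumes each vote is a hypothetical `(winner, loser)` pair of model names recorded when a user picks one answer over another. It computes plain win rates, which is a deliberately naive stand-in for whatever ranking method a real marketplace would use.

```python
from collections import defaultdict


def win_rates(votes):
    """Compute each model's share of pairwise comparisons it won.

    `votes` is an iterable of (winner, loser) model-name pairs.
    """
    wins = defaultdict(int)
    games = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    return {model: wins[model] / games[model] for model in games}


if __name__ == "__main__":
    # Placeholder model names; no real comparison data is implied.
    votes = [("model-a", "model-b"), ("model-a", "model-c"), ("model-b", "model-c")]
    for model, rate in sorted(win_rates(votes).items()):
        print(f"{model}: {rate:.2f}")
```

A model builder could read the same tally two ways: absolute win rate for their own model, and head-to-head results against the other popular models on the platform.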
Data Storage and Processing
Key considerations for the system:
- Flexible “flat file” storage that avoids rigid structure
- Ability to restructure and represent data in various modalities
- Context-aware output formatting based on user needs
- Seamless integration across devices and platforms
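To make the flat-file idea concrete, here is a minimal sketch in which notes are appended to a JSON Lines file with no imposed schema and are only given structure at output time. The file layout, the function names, and the device labels are assumptions for illustration.

```python
import json


def append_note(path, text, **meta):
    """Append one loosely structured record; no schema is enforced."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"text": text, **meta}) + "\n")


def load_notes(path):
    """Read every record back from the flat file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def as_bullets(notes):
    """Visual form for a screen: one bullet per note."""
    return "\n".join(f"- {n['text']}" for n in notes)


def as_speech(notes):
    """Spoken form for a watch or glasses: one sentence per note."""
    return " ".join(n["text"].rstrip(".") + "." for n in notes)


def render(notes, device):
    """Pick the output modality from the (hypothetical) device label."""
    return as_speech(notes) if device in {"watch", "glasses"} else as_bullets(notes)
```

In the envisioned system the restructuring step would be done by a model rather than by fixed functions like these. The point of the sketch is only that storage stays unstructured while presentation is decided per device.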
User Experience Focus
The platform emphasizes:
- Professional-grade tools for power users
- Highly polished, intuitive interfaces
- Voice-first interaction capabilities
- Multi-modal input/output support
- Trust through transparency and open-source elements
This creates a universal platform that serves both developers and no-code builders, providing a comprehensive ecosystem for AI-powered creation and collaboration.
Raw
I'm going to try and summarize the startup idea the way I did yesterday. Basically, there's a marketplace where users utilize models, and the benefit they get is the use of multiple models. It's kind of like a professional-level tool. Part of this is a marketplace where really good users can create agents and deploy them for others to use. On the lab and model builder side, they receive feedback for preference data, both within their model and in comparison with other popular models. The open-source element of this would largely be trust, but the emphasis should really be on a highly polished user experience, kind of like a much better take on pull. From my experience driving across the country using AI, I've realized that what will make AI really powerful for consumers, in particular, is voice input. Very multi-modal input on whatever device they have, like glasses or a watch, is crucial. It's about trying to be on every platform to be part of the user's life. This is where all input or output becomes relevant, and output should be well-organized information in the right modality for user processing. Intermediate storage should be almost like a flat file for the user because when you start imposing structure, it detracts from the point of AI, which is its ability to restructure data in various ways. I think those are the thoughts I have so far, and I can put it together like a PRD or pitch memo for the product after this.
Another observation about this AI marketplace that I just talked about is that as models start to become divergent, they become good at different things, and it makes sense to take features out. For example, GPT-4o is a general-purpose model at OpenAI, and the features it doesn't have... It doesn't have very good reasoning, but the reasoning model doesn't have access to the web or images. And then certain models like... I forget what it is. Anyway, we're getting to this point where even within a company, they're having to enable and disable features because they can't get one model to support everything. And then of course, you can extend this problem across companies. What you can offer is a bunch of standard interfaces that are really good. Like we talked about artifact-centric computing, and I think that theme is relevant here. But... No but, that's all for now.
So, the thing I'm thinking of building in software, I believe, can best be described as a collaborative pro suite for AI for people, almost a long tail of people using AI. It's different from Perplexity or You.com, which are very much targeted at consumers. I think even O is targeting too low of a bar, and I think we're getting to the point where models like o3 are more appropriate. Basically, the audience is the people who are willing to pay $200 for o3 because that's just how much these models improve them at their jobs.
I mean, I said creativity seems to be the way to go here. Why not product design and software design? Figma is probably at risk here because its audience is nontechnical, but "technical" users are far more enabled.
In addition to the benefits I've laid out for a collaborative creative pro tool for AI and artifacts to work together on artifacts, the big thing that any of the single AI companies just won't be able to do is multi-model. So, incorporating something from Claude and something else from GPT to have the best outcome, kind of like Cursor but better, seems to be the way to go here.