The problem I couldn't stop thinking about
I've moved apartments three times in four years. Every time, the same paralysis: I'd stand in an empty room, try to picture what furniture would go where, fail completely, and end up buying the same safe beige things I always buy.
Interior designers exist, but they cost $150–$500/hour and require at least a week of back-and-forth before you see anything. Pinterest boards are beautiful but useless — they're not your room. And those AR furniture apps that let you drop a single couch into a photo? Still not enough.
What I wanted was simple: show me what this specific room could look like, redesigned, right now.
Turns out vision AI can do this. It took 23 days to build it properly.
The technical approach
The core insight: modern vision-language models don't just understand images — they can generate detailed, style-specific redesign descriptions that, when fed back as prompts to an image generation model, produce photorealistic output. The trick is the pipeline, not any single model call.
Here's what actually runs when you upload a photo to Inhabit:
- Upload & store. Your image goes to Cloudflare R2 via a signed URL proxy. R2 is S3-compatible but 10× cheaper and has zero egress fees — critical when you're generating multiple high-res images per user.
- Room analysis. The uploaded image is passed to a vision model with a structured prompt. It returns a JSON description: room type, existing furniture, lighting conditions, approximate square footage, current style. This grounding step is what makes the redesigns coherent instead of hallucinated.
- Parallel concept generation. Three concurrent API calls — one per design style (e.g., Scandinavian minimalist, warm maximalist, industrial loft). Each call gets the room analysis plus a style brief plus the original image as context. `Promise.all()` keeps it fast.
- Results page. The three concepts render side-by-side with descriptions. The page is shareable — each room gets a permanent URL so you can send it to a partner, a contractor, a parent who won't stop asking what your apartment looks like.
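Here's a minimal sketch of step 1, issuing a short-lived signed upload URL for R2. It assumes the AWS SDK v3 S3 client pointed at the R2 endpoint (R2 speaks the S3 API); the bucket name, env vars, and route path are illustrative rather than the production values.

```js
// Minimal sketch: presign an R2 upload URL from an Express route.
// R2 is S3-compatible, so the standard AWS SDK v3 client works against the
// R2 endpoint. Bucket name, env vars, and route path are illustrative.
const express = require("express");
const crypto = require("crypto");
const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");
const { getSignedUrl } = require("@aws-sdk/s3-request-presigner");

const r2 = new S3Client({
  region: "auto",
  endpoint: `https://${process.env.R2_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY,
  },
});

const app = express();

app.post("/api/upload-url", async (req, res) => {
  const key = `uploads/${crypto.randomUUID()}.jpg`;
  // Short-lived URL the browser PUTs the photo to directly, so the image
  // bytes never pass through the app server.
  const url = await getSignedUrl(
    r2,
    new PutObjectCommand({ Bucket: "inhabit-rooms", Key: key, ContentType: "image/jpeg" }),
    { expiresIn: 600 }
  );
  res.json({ url, key });
});

app.listen(process.env.PORT || 3000);
```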
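And a sketch of steps 2 and 3: the grounding analysis followed by parallel generation. The model names, prompt wording, and style briefs are placeholders, not the ones Inhabit actually uses. The real pipeline also passes the original image into the generation step; this sketch conditions only on the analysis text, since how you attach a reference image depends on which image endpoint you're calling.

```js
// Sketch of analyze-then-generate against an OpenAI-compatible API.
// Model names, prompts, and the style briefs are placeholders.
const OpenAI = require("openai");

const client = new OpenAI({
  baseURL: process.env.AI_PROXY_URL, // OpenAI-compatible proxy
  apiKey: process.env.AI_API_KEY,
});

const STYLES = ["Scandinavian minimalist", "warm maximalist", "industrial loft"];

async function analyzeRoom(imageUrl) {
  // Step 2: structured grounding. Ask the vision model for JSON, not prose.
  const res = await client.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model name
    response_format: { type: "json_object" },
    messages: [{
      role: "user",
      content: [
        { type: "text", text: "Describe this room as JSON with keys: room_type, furniture, lighting, approx_sqft, current_style." },
        { type: "image_url", image_url: { url: imageUrl } },
      ],
    }],
  });
  return JSON.parse(res.choices[0].message.content);
}

async function generateConcepts(imageUrl) {
  const analysis = await analyzeRoom(imageUrl);
  // Step 3: three style-specific generations fired concurrently.
  return Promise.all(
    STYLES.map((style) =>
      client.images.generate({
        model: "gpt-image-1", // placeholder model name
        prompt:
          `Photorealistic ${style} redesign of this ${analysis.room_type}. ` +
          `Respect the existing layout, windows, and lighting: ${JSON.stringify(analysis)}`,
        size: "1024x1024",
      })
    )
  );
}
```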
The stack (deliberately boring)
I wanted to ship fast and not get distracted by framework decisions. The stack is aggressively simple:
- Backend: Express.js. No ORM, just raw `pg` queries against a Neon serverless PostgreSQL database. Fast cold starts, zero-config scaling. (A sketch of what this looks like follows this list.)
- Frontend: Server-rendered HTML with inline CSS. No React, no build step. The landing page and results pages are plain HTML files served by Express. This was the right call — the app has one core flow, not a dashboard.
- AI: OpenAI-compatible API for vision analysis and image generation. Routed through a proxy that optimizes model selection per task type.
- Storage: Cloudflare R2 for all images — uploads and generated concepts. Objects served directly via CDN URLs.
- Hosting: Render. Auto-deploy on push to main. No ops.
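For the curious, "raw `pg` queries" in practice looks roughly like this: one Pool pointed at the Neon connection string and parameterized queries behind the shareable room route. The table, column names, and concepts JSON shape are assumptions for illustration.

```js
// Sketch of the no-ORM approach: one Pool, parameterized queries, and a
// server-rendered results page. Table, columns, and the concepts shape
// ({ style, image_url }) are assumed for illustration.
const express = require("express");
const { Pool } = require("pg");

const pool = new Pool({ connectionString: process.env.DATABASE_URL }); // Neon connection string

const app = express();

app.get("/rooms/:slug", async (req, res) => {
  const { rows } = await pool.query(
    "SELECT original_url, concepts FROM rooms WHERE slug = $1",
    [req.params.slug]
  );
  if (rows.length === 0) return res.status(404).send("Room not found");

  const room = rows[0]; // concepts is a jsonb column, parsed by pg into an array
  res.send(`<!doctype html>
    <h1>Your room, three ways</h1>
    <img src="${room.original_url}" alt="Original room">
    ${room.concepts.map((c) => `<img src="${c.image_url}" alt="${c.style}">`).join("")}`);
});

app.listen(process.env.PORT || 3000);
```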
The total infrastructure cost at current traffic is under $20/month. The AI API costs are usage-based and scale with users, which is the right model for a product still finding its footing.
What actually took the longest
Not the AI integration. That part took two days once I stopped overthinking the prompts.
The hard part was handling the latency gracefully. Generating three concepts takes 15–45 seconds depending on the models involved. That's a long time to stare at a spinner. I went through four iterations of the loading state before landing on something that felt intentional rather than broken:
- Streaming status messages ("Analyzing your room...", "Drafting concepts...", "Rendering final designs...")
- A progress bar that moves at a pace calibrated to actual API response times — not a fake animation that finishes before the work does
- The original photo stays visible during generation, so the page never feels empty
Small things. But the difference in perceived quality is significant.
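Concretely, the calibrated bar amounts to something like this: per-phase caps derived from measured latencies, and the bar only snaps to 100% when the API actually responds. The phase durations here are illustrative, not the real numbers.

```js
// Sketch of a progress bar paced to real latencies instead of a looping
// animation. Phase labels match the UI; the durations are illustrative and
// would come from measured response times in practice.
const PHASES = [
  { label: "Analyzing your room...",     expectedMs: 6000,  upTo: 0.30 },
  { label: "Drafting concepts...",       expectedMs: 12000, upTo: 0.65 },
  { label: "Rendering final designs...", expectedMs: 20000, upTo: 0.95 },
];

function runProgress(statusEl, barEl) {
  let phase = 0;
  let start = Date.now();
  const timer = setInterval(() => {
    const p = PHASES[phase];
    statusEl.textContent = p.label;
    // Ease from the previous phase's cap toward this one, never past it.
    const done = Math.min((Date.now() - start) / p.expectedMs, 1);
    const prevCap = phase === 0 ? 0 : PHASES[phase - 1].upTo;
    barEl.style.width = `${((prevCap + (p.upTo - prevCap) * done) * 100).toFixed(1)}%`;
    if (done === 1 && phase < PHASES.length - 1) { phase += 1; start = Date.now(); }
  }, 200);
  // The caller finishes the bar only when the API actually responds:
  //   const finish = runProgress(statusEl, barEl);
  //   await fetch("/api/generate", { method: "POST", body });
  //   finish();
  return () => { clearInterval(timer); barEl.style.width = "100%"; };
}
```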
The other time sink: prompt engineering for consistency. Early versions produced concepts that technically answered the prompt but felt disconnected — a chair floating in a void, a window in the wrong place. The fix was adding the structured room analysis step first. Now the model has a grounded understanding of the physical space before it starts redesigning it.
What surprised me
"People share the results page more than they use the product."
The shareable room URL was a last-minute addition — I almost cut it to ship faster. It ended up being the most-used feature. Turns out people don't just want to see their room redesigned; they want to show someone else. Partners, roommates, parents, contractors. The permanent URL turns a private tool into a social object.
The second surprise: the quality bar is higher than I expected users to demand. Early testers tolerated bad generations during alpha. Real users don't. The first batch of real signups immediately found cases where the AI misidentified the room type or applied a style that clashed with existing architectural features. That feedback drove two weeks of prompt refinement I hadn't planned for.
What's next
The core loop works: upload → analyze → generate → share. The next layer is making the output actionable. Right now you see a beautiful redesigned room with no path to actually achieving it. The natural next step is linking concepts to real products — furniture, lighting, paint colors — that approximate the generated design.
That's a harder problem (attribution, affiliate relationships, inventory data) but it's the right problem if Inhabit is going to be genuinely useful rather than just fun to play with.
For now, the tool is free. Try it on any room photo you have — no signup required.
See how Inhabit compares to other AI interior design tools: Best AI Interior Design Tools 2026 →
See your room redesigned in 30 seconds
No signup. No credit card. Drop a photo and get three concepts.
Try Inhabit free →