How to Build an AI-Powered Restaurant Recommender Like 'Vibe Code' Using LLMs and ClickHouse
End-to-end guide to building a social, AI-driven restaurant recommender using LLMs and ClickHouse—architecture, prompts, data model, and analytics.
Stop guessing where to eat: build a fast, social restaurant recommender that actually understands your group's vibe
Decision fatigue is real. Developers and product teams building social dining tools face a recurring problem: how to combine personal taste, real-time context (who's available, budget, distance), and explainable recommendations without shipping a heavyweight ML stack. In 2026, with lightweight LLMs, affordable embeddings, and ClickHouse's real-time OLAP + vector capabilities, you can build a micro app like Rebecca Yu’s Where2Eat — but production-ready, auditable, and scalable.
What you’ll get from this guide
- End-to-end architecture for a social restaurant recommender using LLMs for natural language understanding and explanations, and ClickHouse for fast analytics and vector search.
- Concrete data model and ClickHouse schemas you can copy-and-adapt.
- Prompt templates and prompt engineering patterns for reliable suggestions and constrained reasoning.
- Real-time pipelines, ranking strategy, and observability/analytics to measure success.
- Security, privacy, and cost trade-offs for 2026 deployments.
Why build this now (2026 context)
By late 2025 and into 2026 the LLM and analytics landscape matured in ways that enable micro apps to be both cheap and powerful. Open and efficient models, multi-modal capabilities, and improvements in vector search mean you don't need a billion-dollar data warehouse to run personalized recommender features. ClickHouse's momentum (notably its major funding round and rapid product advances in 2025–26) pushed vector search and real-time aggregation into mainstream infrastructure — ideal for a small, latency-sensitive recommender app.
High-level architecture
Design principles: keep the user flow snappy, make recommendations explainable, and separate retrieval from generation for cost control. Here's a pragmatic architecture for a micro app:
- Edge API / Inference Gateway: routes requests to the appropriate LLM or embedding model, enforces rate limits, and applies caching.
- Retrieval Layer (ClickHouse): stores restaurant metadata, embeddings, interactions, and serves fast vector + filter queries to return candidate lists.
- Reranker (LLM or lightweight model): takes the candidate set + chat context and produces a ranked list with short natural-language rationale.
- Event Stream (Kafka/Pulsar): streams interactions back into ClickHouse for analytics and model-feedback loops.
- Observability & Dashboard: Superset/Metabase or custom UI backed by ClickHouse for metrics like precision@k, latency, and conversion.
Why ClickHouse?
ClickHouse lets you combine OLAP-scale aggregation and real-time ingestion with efficient vector retrieval. That means a single primary datastore can answer both “which restaurants match this embedding?” and “how often did this suggestion lead to a reservation?” without expensive ETL. For a micro app you get speed and lower operational complexity.
Data model: what to store
Keep the model simple but query-friendly. You’ll need four core types of objects:
- Restaurant catalog: static metadata (name, cuisine, price, hours, lat/lon, tags, menu links, verified cleanliness/safety flags).
- Embeddings: dense vectors representing restaurant descriptions, menus, and images (if using multi-modal embeddings).
- User & group profiles: preference signals and short text bios; for privacy, keep PII minimal and store hashed IDs.
- Interactions/events: impressions, clicks, saves, RSVPs, and chat messages — streamed into ClickHouse for analytics and training signals.
Example ClickHouse schema
Below are compact table definitions you can adapt. They use ClickHouse's MergeTree engine family for fast writes and queries. Embeddings are stored as Array(Float32), with similarity computed using SQL array functions.
-- restaurants table
CREATE TABLE restaurants (
restaurant_id UInt64,
name String,
description String,
cuisine Array(String),
price_tier UInt8,
lat Float64,
lon Float64,
tags Array(String),
embedding Array(Float32),
updated_at DateTime
) ENGINE = MergeTree()
ORDER BY (restaurant_id);
-- interactions/events (streamed)
CREATE TABLE events (
event_time DateTime,
user_id UInt64,
group_id UInt64,
event_type String, -- impression/click/rsvp
restaurant_id UInt64,
metadata JSON
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events', '{replica}')
ORDER BY (event_time, user_id, restaurant_id);
-- Plain (replicated) MergeTree keeps every event; a ReplacingMergeTree keyed
-- only on event_time would silently collapse distinct events that share a timestamp.
Note: newer ClickHouse releases support approximate nearest-neighbour (vector search) indexes over Array(Float32) columns; if your version supports them, add one to speed up similarity queries. The plain Array(Float32) scan is portable and works well at micro-app scale.
Retrieval + Reranking pipeline
Split the recommendation into two steps for both cost and responsiveness:
- Retrieve (ClickHouse): run a hybrid query that uses vector similarity to get ~50 candidates, combined with deterministic filters (open now, within walking distance, fits budget).
- Rerank (LLM / lightweight cross-encoder): feed the small candidate set plus the chat context to an LLM for personalization, explanation, and constraint handling (e.g., allergies).
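A minimal sketch of that split in Python, with toy stand-ins for the retrieval query and the LLM call (fetch_candidates and llm_rerank are hypothetical; wire in your ClickHouse client and model API):
# Sketch of the two-step pipeline; fetch_candidates and llm_rerank are
# hypothetical stand-ins for your ClickHouse client and LLM API.
def fetch_candidates(query_emb, filters, limit=50):
    # In production: bind the filters into the hybrid vector + filter SQL shown
    # below and run it against ClickHouse. Toy data keeps the sketch runnable.
    return [
        {"id": 1, "name": "Spice Garden", "tags": {"spicy", "thai"}, "score": 0.97},
        {"id": 2, "name": "Green Table", "tags": {"vegan", "casual"}, "score": 0.88},
    ]

def llm_rerank(candidates, chat_context):
    # In production: render the rerank prompt with the candidate block plus
    # chat context and parse the JSON reply. Here: sort by retrieval score.
    return sorted(candidates, key=lambda c: c["score"], reverse=True)

def recommend(query_emb, filters, chat_context, top_k=3):
    candidates = fetch_candidates(query_emb, filters)   # cheap, wide net
    ranked = llm_rerank(candidates, chat_context)       # expensive, narrow set
    return ranked[:top_k]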
Candidate retrieval SQL (cosine similarity)
If your ClickHouse doesn't provide a built-in cosine function (recent releases ship cosineDistance), compute it with SQL array ops. This example returns the top 50 restaurants closest to the query embedding while filtering by price and distance; greatCircleDistance takes longitude/latitude in degrees and returns meters.
SELECT
    restaurant_id,
    name,
    price_tier,
    tags,
    arraySum(arrayMap((x, y) -> x * y, embedding, {query_emb})) AS dot,
    sqrt(arraySum(arrayMap(x -> x * x, embedding))) AS norm_r,
    sqrt(arraySum(arrayMap(x -> x * x, {query_emb}))) AS norm_q,
    dot / (norm_r * norm_q) AS cosine
FROM restaurants
WHERE price_tier <= 2
    AND greatCircleDistance(lon, lat, {user_lon}, {user_lat}) < {max_dist_meters}
ORDER BY cosine DESC
LIMIT 50;
Run this query from the inference gateway after generating the query embedding with whichever embedding model you use. Keep embeddings normalized on insert to avoid repeated norm calculations.
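With normalized vectors the cosine reduces to a plain dot product, so the norm terms above can be dropped. A minimal helper (standard-library Python) you might run before inserting embeddings:
import math

def normalize(vec):
    # Scale to unit length so cosine similarity equals a plain dot product.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

emb = normalize([0.3, 0.4, 0.5])
assert abs(sum(x * x for x in emb) - 1.0) < 1e-9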
Prompt engineering: templates that work
In 2026, robust prompt patterns focus on constrained generation, provenance, and token budget control. Use system-level instructions and give the LLM only the necessary context: short chat history, user preferences, and the pre-ranked candidate block.
Two-step prompt pattern
- Rerank prompt — concise, deterministic instructions to sort candidates and produce a short explanation per item.
- Explain/Share prompt — produce a human-facing message for the group chat, including 1-sentence rationale and confidence.
System: You are an assistant that ranks restaurants for a small group. Follow constraints exactly. Do not hallucinate menu items or ratings.
User: Conversation: "Three of us — two vegans, one loves spicy food. Budget: $$. Open at 7pm. Prefer somewhere walkable."
Candidates:
1) Name: Spice Garden; Tags: spicy, thai; Score: 0.97
2) Name: Green Table; Tags: vegan, casual; Score: 0.88
3) Name: Cafe North; Tags: coffee, pastries; Score: 0.65
Task: Return a JSON array of top 3 with keys {id, name, final_rank, reason (15-25 words), confidence (0.0-1.0)}. Keep reasons factual (cite tags and distance if relevant). If a candidate violates constraints (closed, no vegan options), exclude it.
Why this works: provide the LLM with vetted candidates (reduces hallucination) and force structured output for easy parsing.
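Because the output is constrained JSON, parsing stays trivial. A defensive sketch (key names mirror the prompt above) that rejects malformed replies instead of trusting the model:
import json

REQUIRED_KEYS = {"id", "name", "final_rank", "reason", "confidence"}

def parse_rerank_output(raw):
    # Parse and validate the LLM's JSON array; raise on anything malformed.
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    for item in items:
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"missing keys: {missing}")
        if not 0.0 <= float(item["confidence"]) <= 1.0:
            raise ValueError("confidence out of range")
    return sorted(items, key=lambda i: i["final_rank"])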
Handling hard constraints and safety
- Allergies & dietary rules: enforce in the retrieval filter first. If retrieval cannot guarantee, have the LLM flag uncertainty rather than inventing facts.
- Open/closed hours: prefer deterministic checks against the restaurant metadata (see the sketch after this list).
- Content safety: apply content filters on chat input before sending to the LLM. Log user consent for storing chat if you plan to use it for training.
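Hours checks belong in deterministic code, not in the prompt. A sketch assuming hours are stored per weekday as (open, close) minute-of-day windows (your catalog format will likely differ):
from datetime import datetime

def is_open(hours, when):
    # hours maps weekday (0 = Monday) to a list of (open_min, close_min) windows.
    minute = when.hour * 60 + when.minute
    return any(o <= minute < c for o, c in hours.get(when.weekday(), []))

friday_hours = {4: [(11 * 60, 22 * 60)]}                    # open 11:00-22:00 Fridays
print(is_open(friday_hours, datetime(2026, 1, 2, 19, 0)))   # True: Friday, 7pm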
Real-time analytics and feedback loop
One of the advantages of building on ClickHouse is fast iterative analytics. Stream every event (impression, click, save, RSVP) into ClickHouse and define materialized views for product metrics:
-- The MV's target table must exist first; SummingMergeTree merges the partial counts.
CREATE TABLE daily_metrics (
    day Date, total_events UInt64, clicks UInt64, rsvps UInt64
) ENGINE = SummingMergeTree() ORDER BY day;

CREATE MATERIALIZED VIEW mv_daily_metrics TO daily_metrics AS
SELECT
    toDate(event_time) AS day,
    count() AS total_events,
    countIf(event_type = 'click') AS clicks,
    countIf(event_type = 'rsvp') AS rsvps
FROM events
GROUP BY day;

-- Derive CTR at query time; ratios stored per insert block don't merge correctly:
-- SELECT day, sum(clicks) / sum(total_events) AS ctr FROM daily_metrics GROUP BY day;
Key metrics to track:
- Precision@K — percentage of top K suggestions that lead to a click/reservation (see the sketch after this list).
- CTR and conversion — impression to click and click to RSVP rates.
- Latency p95 — end-to-end recommendation latency (target <1s for micro apps; can accept longer for LLM reranking if UX is async).
- Diversity — ensure repeated suggestions are not stale across sessions for the same group.
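As an illustration, precision@K over logged sessions (a toy in-memory version; in practice this is a query over the events table):
def precision_at_k(sessions, k=5):
    # sessions: list of (shown_ids_in_rank_order, clicked_id_set) pairs.
    hits = total = 0
    for shown, clicked in sessions:
        top = shown[:k]
        hits += sum(1 for r in top if r in clicked)
        total += len(top)
    return hits / total if total else 0.0

sessions = [([1, 2, 3, 4, 5], {2, 5}), ([6, 7, 8, 9, 10], {6})]
print(precision_at_k(sessions))  # 0.3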
Performance, cost, and hybrid ranking
LLM calls are the most expensive part. Use a hybrid approach:
- Use ClickHouse vector retrieval for the heavy lifting.
- Apply a cheap learned or rule-based reranker for low-latency interactions (e.g., machine-learned linear model). Use the LLM only for final human-readable explanations or weekly personalization updates.
- Cache results for repeated queries (same group + same context) to avoid recomputation.
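A minimal cache sketch using only the standard library: the key combines the group ID with a hash of the normalized context, so identical asks within the TTL skip recomputation (the TTL value is an assumption to tune):
import hashlib
import time

_cache = {}          # (group_id, context_hash) -> (expires_at, result)
TTL_SECONDS = 600    # assumption: results stay fresh for 10 minutes

def cached_recommend(group_id, context, compute):
    # compute is a zero-argument callable that runs the full pipeline.
    key = (group_id, hashlib.sha256(context.strip().lower().encode()).hexdigest())
    entry = _cache.get(key)
    if entry and entry[0] > time.time():
        return entry[1]
    result = compute()
    _cache[key] = (time.time() + TTL_SECONDS, result)
    return result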
Personalization & cold-start
For micro apps used by small friend groups, personalization must work with limited data:
- Profile seeding: seed users with quick preference sliders (spicy, quiet, budget) at signup.
- Group embeddings: compute a group embedding by averaging member embeddings plus the chat context embedding; use that as the query vector (sketch after this list).
- Cold start: fall back to popularity + recency + cuisine diversity when user data is sparse.
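A sketch of that group-embedding step: average member vectors, blend in the chat-context vector (the 0.5 context weight is an assumption to tune), and renormalize:
import math

def group_embedding(member_embs, context_emb, context_weight=0.5):
    # Average member preference vectors, then blend with the chat-context vector.
    dim = len(context_emb)
    avg = [sum(m[i] for m in member_embs) / len(member_embs) for i in range(dim)]
    blended = [(1 - context_weight) * avg[i] + context_weight * context_emb[i]
               for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in blended))   # renormalize for cosine retrieval
    return [x / norm for x in blended] if norm > 0 else blended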
Observability & debugging
Make decisions auditable. Store the candidate list and the reranker input/output for every recommendation request (redact PII). This makes it possible to debug when the app suggests inappropriate options and iteratively improve prompts.
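One lightweight shape for those audit records, as a sketch (field names are illustrative, not a fixed schema):
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class AuditRecord:
    request_id: str
    group_id_hash: str            # hashed, never raw PII
    candidate_ids: list
    reranker_prompt: str          # redact PII before storage
    reranker_output: str
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

def to_row(record):
    # Serialize for insertion into an audit table or object store.
    return json.dumps(asdict(record))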
Privacy and compliance (practical rules)
- Keep personally identifying data minimal and encrypted. Use hashed user IDs for analytics (see the sketch after this list).
- Offer an opt-out for using chat data for model improvement; store opt-outs in ClickHouse and respect them upstream.
- Local vs cloud models: if sensitive data is involved, prefer self-hosted inference or models served under a contractual no-log guarantee.
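One caveat on hashed IDs: a bare SHA-256 over a small ID space is trivially reversible by brute force, so prefer a keyed hash. A sketch using HMAC with a server-side secret (the secret handling here is a placeholder assumption):
import hmac
import hashlib

SECRET = b"rotate-me-and-store-in-a-secrets-manager"  # placeholder, not for production

def hash_user_id(user_id):
    # Keyed hash so analytics IDs can't be reversed by enumerating user IDs.
    return hmac.new(SECRET, str(user_id).encode(), hashlib.sha256).hexdigest()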
Example implementation checklist (step-by-step)
- Gather a restaurant catalog (initial seed: OpenTable/Yelp APIs, manual CSV). Normalize schema to the restaurants table above.
- Pick embedding model and generate embeddings for restaurants. Insert into ClickHouse (normalize vectors).
- Build a simple client UI to capture group members and chat context.
- Implement the inference gateway: generate query embedding, call ClickHouse retrieve query, call reranker LLM with top candidates, return JSON to client.
- Stream interaction events to ClickHouse and build dashboards (CTR, precision@K, latency).
- Iterate on prompts and rules; add caching and a cheap reranker to control costs.
Benchmarking & expected numbers (practical targets)
Use these as starting SLAs for a micro app in 2026:
- Recommendation latency: p50 < 400ms (retrieval + cheap rerank), full LLM explanation p95 < 1.5s.
- Cost per active recommendation: target <$0.02 using hybrid rerank & caching (highly model-dependent).
- Precision@5: aim for 0.35–0.5 in early iterations; improve with feedback loop and personalization.
Future-proofing & 2026 trends
Trends to watch and adopt:
- On-device LLMs: increasing feasibility for private, offline recommendations on mobile devices.
- Multi-modal embeddings: combining menu text, photos, and short reviews into a joint embedding space for richer retrieval.
- ClickHouse vector acceleration: continued improvements mean more of your retrieval and analytics can live in the same store.
- Explainability as a feature: users prefer transparent reasons; LLMs make this cheap and natural.
Make the recommender auditable: store candidates, model inputs, and outputs so product and legal teams can explain any suggestion.
Common pitfalls and how to avoid them
- Hallucinations: avoid passing raw catalog-free prompts to the LLM. Always feed vetted candidate data for generation.
- Latency surprises: measure end-to-end and isolate LLM cost vs retrieval cost. Add a cheap reranker for quick responses.
- Privacy leaks: never store raw chat transcripts without explicit consent; use hashing and redaction.
- Overfitting to popularity: enforce diversity constraints in reranking and periodically surface niche options.
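For the diversity constraint, a greedy maximal-marginal-relevance (MMR) pass over the candidate set is a common approach. A sketch using tag overlap as the similarity signal (swap in embedding similarity if available; the 0.7 relevance weight is an assumption):
def mmr_select(candidates, k=5, lambda_rel=0.7):
    # Greedy MMR: trade relevance against similarity to already-picked items.
    # candidates: list of dicts with "score" (relevance) and "tags" (set).
    def sim(a, b):  # Jaccard overlap on tags as a cheap similarity proxy
        union = a["tags"] | b["tags"]
        return len(a["tags"] & b["tags"]) / len(union) if union else 0.0

    picked, pool = [], list(candidates)
    while pool and len(picked) < k:
        best = max(pool, key=lambda c: lambda_rel * c["score"]
                   - (1 - lambda_rel) * max((sim(c, p) for p in picked), default=0.0))
        picked.append(best)
        pool.remove(best)
    return picked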
Actionable takeaways
- Combine ClickHouse vector retrieval with an LLM reranker: retrieval for scale, LLM for nuance.
- Use structured prompts and filtered candidate inputs to prevent hallucinations and keep generation deterministic.
- Stream all events into ClickHouse and iterate quickly on metrics like precision@K, CTR, and latency.
- Start small: a micro app with a simple data model gets you to usable recommendations fast; add personalization after you have event signals.
Next steps & call to action
If you want a jumpstart, clone a starter repo (backend + ClickHouse schema + prompt templates), seed it with a small catalog, and run a 48-hour experiment with a friend group. Measure CTR and watch the top suggestions evolve — you’ll be surprised how fast it improves with real interactions.
Ready to build? Start by defining your minimal catalog and wiring ClickHouse for retrieval. If you'd like, download our ClickHouse schema and prompt templates to get a working prototype in under a week — then iterate using the analytics patterns above.