Build a Micro Restaurant Recommender: From ChatGPT Prompts to a Raspberry Pi-Powered Micro App
Build a Raspberry Pi-powered micro restaurant recommender combining no-code frontends and local or cloud LLM inference—step-by-step (2026).
Stuck in decision fatigue? Build a tiny, personal restaurant recommender that runs on a Raspberry Pi — no massive cloud bill required.
Students, teachers, and lifelong learners: you don’t need an expensive course or a production-grade team to ship a practical portfolio project that blends no-code micro apps with local AI inference. In this hands-on guide (2026 edition) I’ll walk you through building a compact restaurant recommender inspired by Rebecca Yu’s vibe-coding project, deploying it to a Raspberry Pi, and wiring up two interchangeable inference options: a cloud LLM (ChatGPT or Anthropic Claude) and a local model running on-device.
Why this project matters in 2026
Micro apps — tiny, personal, single-purpose applications — exploded in popularity after 2023 and matured into reliable tools by late 2025. They are perfect portfolio pieces because they show you can identify a real need, prototype fast, and ship a working product. Meanwhile, hardware improvements (Raspberry Pi 5 + AI HAT+ 2) and better small-model runtimes make local inference viable for many micro apps.
“It took Rebecca Yu seven days to vibe code her dining app.” — TechCrunch summary of Rebecca Yu’s Where2Eat project (inspiration)
In short: this project demonstrates modern trends (micro apps, local AI, and hybrid cloud/local inference) and produces a deployable Node.js microservice plus a no-code or low-code frontend you can show off on your resume.
Project overview — what you'll build
- Frontend: a lightweight no-code micro app or static UI (Glide/Retool/Simple HTML) to collect user preferences and display recommendations.
- Backend: a Node.js microservice that fetches restaurant data, scores options, and calls an LLM for refined suggestions.
- Inference: two interchangeable backends — (A) cloud LLM (OpenAI ChatGPT or Anthropic Claude) via API, and (B) local model running on Raspberry Pi (via llama.cpp / ggml-style runtime or an AI HAT-enabled binary).
- Deployment: run the app on a Raspberry Pi (systemd + reverse proxy) and expose it to your local network or a private URL.
Prerequisites
- Raspberry Pi 5 (recommended) or Pi 4 with 8GB+ RAM. If you have the AI HAT+ 2 (launched late 2025), you get better inferencing headroom on-device — see hardware field notes like the Nomad Qubit Carrier v1 and other edge testbeds for context.
- Basic familiarity with Node.js, npm, and terminal commands.
- Optional accounts: OpenAI (for ChatGPT API) or Anthropic (for Claude) if you plan to use cloud inference.
- No-code tool choice: Glide, Retool, or a static site generator. You can also use a simple HTML + JS page to keep everything free.
Step 1 — Define the data model
Start with a small JSON file with restaurants and attributes. Keep it simple so you can expand later:
// restaurants.json
[
{ "id": 1, "name": "Sunrise Noodles", "cuisine": "Chinese", "price": 2, "rating": 4.3, "tags": ["noodles","cozy"] },
{ "id": 2, "name": "La Taqueria", "cuisine": "Mexican", "price": 1, "rating": 4.7, "tags": ["tacos","fast"] },
{ "id": 3, "name": "Green Table", "cuisine": "Vegan", "price": 3, "rating": 4.2, "tags": ["healthy","brunch"] }
]
Keep the dataset local for now (JSON or a small SQLite DB). This preserves privacy and makes local-only demos feasible.
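If you later outgrow the flat file, here is a minimal import sketch for the SQLite option. It assumes the better-sqlite3 npm package (not part of the original project) and keeps the same fields as restaurants.json:
// import-to-sqlite.js (optional): load restaurants.json into a small SQLite DB
// Assumes the better-sqlite3 package: npm install better-sqlite3
const fs = require('fs');
const Database = require('better-sqlite3');
const restaurants = JSON.parse(fs.readFileSync('./restaurants.json', 'utf8'));
const db = new Database('./restaurants.db');
// One table; tags stored as a JSON string to keep the schema tiny
db.exec(`CREATE TABLE IF NOT EXISTS restaurants (
  id INTEGER PRIMARY KEY, name TEXT, cuisine TEXT, price INTEGER, rating REAL, tags TEXT
)`);
const insert = db.prepare(
  'INSERT OR REPLACE INTO restaurants (id, name, cuisine, price, rating, tags) VALUES (?, ?, ?, ?, ?, ?)'
);
for (const r of restaurants) {
  insert.run(r.id, r.name, r.cuisine, r.price, r.rating, JSON.stringify(r.tags));
}
console.log(`Imported ${restaurants.length} restaurants`);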
Step 2 — Build the Node.js microservice
Create a tiny Express server with one primary endpoint: /recommend. The endpoint accepts user preferences (cuisine, budget, mood) and returns ranked results. Save this as index.js.
const express = require('express');
const fs = require('fs');
const bodyParser = require('body-parser');
const fetch = require('node-fetch'); // optional on Node 18+, which ships a global fetch
const RESTAURANTS = JSON.parse(fs.readFileSync('./restaurants.json', 'utf8'));
const app = express();
app.use(bodyParser.json());
const useLocal = process.env.USE_LOCAL === 'true';
const localEndpoint = 'http://localhost:5000/llm'; // example local LLM adapter
function scoreSimple(pref, r) {
  // Simple deterministic score: rating + cuisine match + budget proximity
  let score = r.rating;
  if (pref.cuisine && pref.cuisine.toLowerCase() === r.cuisine.toLowerCase()) score += 1.0;
  score -= Math.abs((pref.budget || 2) - r.price) * 0.3;
  if (pref.tags) {
    r.tags.forEach(t => { if (pref.tags.includes(t)) score += 0.2; });
  }
  return score;
}
async function callLLM(prompt, provider = 'openai') {
  if (useLocal) {
    const res = await fetch(localEndpoint, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ prompt })
    });
    return (await res.json()).text;
  } else {
    // Cloud example: OpenAI Chat Completions (replace with your provider)
    const key = process.env.OPENAI_API_KEY;
    const r = await fetch('https://api.openai.com/v1/chat/completions', {
      method: 'POST',
      headers: { 'Authorization': `Bearer ${key}`, 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: 'gpt-4o-mini', // example modern model
        messages: [{ role: 'user', content: prompt }],
        temperature: 0.7
      })
    });
    const j = await r.json();
    return j.choices?.[0]?.message?.content || '';
  }
}
app.post('/recommend', async (req, res) => {
  const prefs = req.body;
  // quick deterministic candidates
  const scored = RESTAURANTS.map(r => ({ ...r, score: scoreSimple(prefs, r) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 5);
  // refine with LLM for personalized explanations and tiebreakers
  const prompt = `User preferences: ${JSON.stringify(prefs)}\nCandidates: ${JSON.stringify(scored)}\nGive me a ranked list of these restaurants with a short reason for each (1-2 sentences).`;
  const llmText = await callLLM(prompt);
  res.json({ candidates: scored, llmExplanation: llmText });
});
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Server running on ${PORT}`));
This pattern gives you two useful things: a cheap deterministic ranking for offline speed, plus LLM-based refinement for better user-facing suggestions when a model is available.
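Before wiring up a frontend, it's worth a quick smoke test. The sketch below assumes the server is already running locally on port 3000 and uses the global fetch that ships with Node 18+:
// smoke-test.js: post sample preferences to /recommend and print the response
const prefs = { cuisine: 'Mexican', budget: 1, tags: ['fast'] };
fetch('http://localhost:3000/recommend', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(prefs)
})
  .then(r => r.json())
  .then(j => {
    console.log('Top candidates:', j.candidates.map(c => c.name));
    console.log('LLM explanation:', j.llmExplanation);
  })
  .catch(err => console.error('Request failed:', err));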
Choosing cloud vs local inference
- Cloud (ChatGPT / Claude): best for quality, natural language nuance, and quick iteration. Expect latency and API costs — see reviews of cloud cost and observability tooling when planning your demo budget. Use when your demo audience is small or you have a budget for API calls.
- Local: best for privacy, low recurring cost, and offline capability. Tradeoffs: smaller models, potential accuracy and latency limits. Great for demonstrating edge deployments and applied ML skills; see hardware and edge-AI field notes like Edge AI for Retail for performance expectations.
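If you prefer Claude for the cloud path, the cloud branch of callLLM can be swapped for Anthropic's Messages API. Here is a rough sketch; callClaude is a helper name I'm introducing for illustration, and the model ID is a placeholder to replace with a current one from Anthropic's docs:
// callClaude: a drop-in alternative to the OpenAI branch of callLLM
async function callClaude(prompt) {
  const res = await fetch('https://api.anthropic.com/v1/messages', {
    method: 'POST',
    headers: {
      'x-api-key': process.env.ANTHROPIC_API_KEY,
      'anthropic-version': '2023-06-01',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'claude-3-5-haiku-latest', // placeholder: check Anthropic docs for current model IDs
      max_tokens: 512,
      messages: [{ role: 'user', content: prompt }]
    })
  });
  const j = await res.json();
  // The Messages API returns an array of content blocks; take the first text block
  return j.content?.[0]?.text || '';
}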
Step 3 — Local inference on Raspberry Pi (2026 tips)
By late 2025–early 2026, the Raspberry Pi 5 (and the new AI HAT+ 2) made local LLM inference realistic for micro apps. You’ll typically run a lightweight runtime (llama.cpp, ggml/gguf runtimes, or an optimized vendor binary) that exposes a simple HTTP API the Node.js service can call.
Install base OS and Node
- Flash Raspberry Pi OS 64-bit or Ubuntu 22.04/24.04 (64-bit).
- Install Node.js 20+ (or use nvm).
- Optional: Install Docker if you prefer containerized runtimes.
Run a local LLM adapter
Several community runtimes expose a REST API wrapper around a local model. The simplest approach is to run a small adapter that listens on port 5000 and accepts a prompt. For performance, use a quantized GGUF model tuned for edge devices. Example adapter (minimal; the ./bin/llm-local binary is a placeholder for your runtime's CLI):
// local-llm-adapter.js (very small wrapper)
const express = require('express');
const { spawn } = require('child_process');
const bodyParser = require('body-parser');
const app = express();
app.use(bodyParser.json());
app.post('/llm', (req, res) => {
  const prompt = req.body.prompt;
  // call a local binary (llama.cpp, or an AI-HAT binary); adjust flags per runtime
  const child = spawn('./bin/llm-local', ['--prompt', prompt]);
  let out = '';
  child.stdout.on('data', d => out += d.toString());
  child.on('close', () => res.json({ text: out }));
});
app.listen(5000);
Notes:
- Use a small 4-bit quantized model to fit RAM; 2026 runtimes often use GGUF quantization to balance speed and quality — see compact model guides and community notes like those in the Portable Study Kits roundup for device-friendly choices.
- If you have an AI HAT+ 2, follow the vendor docs to accelerate inference; it can make the difference between an unusable demo and a responsive one.
Step 4 — No-code frontend options
For a micro app you don’t need a heavy React app. Pick one of these low-friction routes:
- Glide / Softr: Connect a Google Sheet or Airtable and call your Node.js endpoint with a webhook to fetch recommendations. Good for fast demos and shareable UI.
- Retool / AppSmith: Build internal-facing micro apps and connect to your Node endpoint for richer controls. If you plan to scale or show governance, check micro-app governance patterns.
- Static HTML + Fetch: Hand-code a small page that posts to /recommend and renders results. This is ideal for a portfolio because it shows you wrote the glue code.
Example minimal client (static):
<form id="prefs">
  <input name="cuisine" placeholder="Cuisine"/>
  <input name="budget" placeholder="Budget 1-3"/>
  <button>Recommend</button>
</form>
<div id="out"></div>
<script>
document.getElementById('prefs').addEventListener('submit', async e => {
  e.preventDefault();
  const form = e.target;
  const prefs = { cuisine: form.cuisine.value, budget: Number(form.budget.value) };
  const r = await fetch('/recommend', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(prefs)
  });
  const j = await r.json();
  document.getElementById('out').innerText = JSON.stringify(j, null, 2);
});
</script>
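Note that the relative /recommend path above only resolves if the page is served from the same origin as the Node service. One simple way to do that, assuming you place the HTML in a public/ folder next to index.js, is to let Express serve it:
// add near the top of index.js so fetch('/recommend') hits the same origin
app.use(express.static('public'));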
Step 5 — Deploy to Raspberry Pi
- Clone your project on the Pi and install npm deps.
- Install your local LLM adapter (if using). Place the quantized model on disk and ensure the runtime can access it.
- Create a systemd service for your Node app so it restarts automatically.
- Use Caddy or Nginx as a reverse proxy to add TLS and friendly URLs. Caddy simplifies automatic HTTPS issuance if you point a public domain at the Pi.
Example systemd unit (save as /etc/systemd/system/restaurant.service):
[Unit]
Description=Micro Restaurant Recommender
After=network.target
[Service]
ExecStart=/usr/bin/node /home/pi/project/index.js
WorkingDirectory=/home/pi/project
Restart=always
User=pi
Environment=USE_LOCAL=true
[Install]
WantedBy=multi-user.target
Step 6 — Performance, cost, and UX tradeoffs
By 2026, hybrid deployments are the pragmatic default:
- Local-first, cloud-optional: use deterministic scoring first, then call the local LLM for on-device personalization. Fall back to a cloud LLM only if local inference is too slow or you need higher-quality responses, an approach aligned with edge-first, cost-aware strategies.
- Quantize to fit memory: 4-bit quantization and GGUF models are the norm for edge devices; check your runtime docs for guidance and model notes in community hardware roundups like Portable Study Kits.
- Batch and cache: cache LLM explanations or precompute rankings during low-load periods to reduce latency and cost (a minimal cache sketch follows this list); monitoring cost signals can help here, see cloud observability reviews for tooling.
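To make the batch-and-cache idea concrete, here is a minimal in-memory sketch. A plain Map with a TTL is enough for a single-Pi demo; callLLMCached is a wrapper name I'm introducing around the callLLM function from index.js:
// simple TTL cache for LLM explanations, keyed by the prompt string
const llmCache = new Map();
const TTL_MS = 10 * 60 * 1000; // keep explanations for 10 minutes
async function callLLMCached(prompt) {
  const hit = llmCache.get(prompt);
  if (hit && Date.now() - hit.at < TTL_MS) return hit.text; // serve from cache
  const text = await callLLM(prompt);
  llmCache.set(prompt, { text, at: Date.now() });
  return text;
}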
Prompts that work — ChatGPT vs Claude vs Local
Design prompts that supply structured context. Example (short):
Prompt:
User preferences: {"cuisine":"Mexican","budget":1,"occasion":"quick lunch"}
Candidates: [ {id:2, name:"La Taqueria", score:4.6, ...}, ... ]
Task: Return a ranked list (1-3) with 1 line reason per item, consider speed and price for quick lunch.
Notes:
- ChatGPT-style models are generally stronger at conversational reasoning.
- Anthropic Claude often focuses on safer, structured outputs and can be easier to prompt for tabular responses.
- Local models vary — tune prompts and temperature conservatively and rely on deterministic scoring for critical constraints (e.g., affordability).
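For local models in particular, it helps to ask for machine-readable output and fall back to the deterministic ranking when parsing fails. A sketch, with buildRankingPrompt and parseRankingOrFallback as illustrative helper names (the JSON-only instruction works with most models but is not guaranteed):
// build a prompt that asks for strict JSON, then parse defensively
function buildRankingPrompt(prefs, candidates) {
  return `User preferences: ${JSON.stringify(prefs)}\n` +
    `Candidates: ${JSON.stringify(candidates)}\n` +
    'Return ONLY a JSON array: [{"id": <number>, "reason": "<one sentence>"}], best match first.';
}
function parseRankingOrFallback(llmText, candidates) {
  try {
    const ranked = JSON.parse(llmText);
    if (Array.isArray(ranked) && ranked.every(x => typeof x.id === 'number')) return ranked;
  } catch (e) {
    // fall through to the deterministic order below
  }
  return candidates.map(c => ({ id: c.id, reason: 'Ranked by deterministic score.' }));
}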
Real-world examples & case study
Rebecca Yu’s Where2Eat (see the TechCrunch coverage linked below) is a classic micro app example: quick, personal, and problem-driven. By late 2025, hardware improvements had made similar personal apps far more capable on-device. For a class project, students have shipped variants that combine a Glide frontend, a Pi-based Node.js backend, and a tiny LLM for natural-language reasoning.
Advanced strategies and future-proofing (2026+)
- Edge orchestration: offload heavy inference to a local NUC or small GPU box if you outgrow Pi performance — see orchestration patterns in edge-aware orchestration.
- Federated preferences: keep user preferences on-device and sync anonymized signals to a central index for group recommendations — this pattern connects to privacy-first preference designs like privacy-first preference centers.
- Modular backends: design your Node.js adapter to switch between providers by changing an environment variable (see the dispatch sketch after this list); this makes demos easier and safer for portfolio review.
- Explainability: surface short LLM explanations so users trust the recommendation; this is a small UX tweak with big impact — creators are also using micro-events and maker communities to test UX in the wild (see maker pop-up strategies).
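A small dispatch map is one way to implement the modular-backend bullet above. The callLocal and callOpenAI names below are illustrative refactorings of the two branches already inside callLLM, and callClaude is the Anthropic sketch from earlier:
// pick an inference backend by environment variable, e.g. LLM_PROVIDER=local|openai|anthropic
const providers = {
  local: prompt => callLocal(prompt),      // adapter on http://localhost:5000/llm
  openai: prompt => callOpenAI(prompt),    // OpenAI branch from index.js
  anthropic: prompt => callClaude(prompt)  // Anthropic sketch shown earlier
};
function callConfiguredLLM(prompt) {
  const name = process.env.LLM_PROVIDER || 'local';
  const fn = providers[name];
  if (!fn) throw new Error(`Unknown LLM_PROVIDER: ${name}`);
  return fn(prompt);
}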
Troubleshooting checklist
- If inference is slow: check the model's quantization level, make sure the model fits in RAM rather than spilling into swap, or upgrade to an AI HAT+ 2 or a small GPU; field reviews like Nomad Qubit Carrier discuss real-device tradeoffs.
- If responses are hallucinated: rely more on deterministic scores or provide stricter system prompts limiting creative extrapolation.
- If the Pi overheats under load: add a cooling fan or throttle inference. Edge inference is a marathon, not a sprint.
Actionable takeaways
- Start small: ship the Node.js microservice with a JSON dataset first.
- Design for hybrid inference: local deterministic scoring + optional LLM refinement.
- Use a no-code frontend to iterate UI quickly; replace with custom UI later to show engineering depth.
- Deploy on Raspberry Pi to demonstrate real-device skills and edge deployment know-how.
Resources & further reading (2026)
- TechCrunch coverage of Rebecca Yu’s Where2Eat for inspiration.
- ZDNET coverage of the Raspberry Pi AI HAT+ 2 (late 2025) for hardware context.
- Vendor docs for your chosen local runtime (llama.cpp, ggml/gguf adaptors) — follow current install and quantization guides.
Final notes — what to show on your portfolio
When you publish this project, highlight:
- Problem statement and user stories: decision fatigue in group chats.
- Architecture diagram: frontend (no-code) → Node.js → LLM (local/cloud).
- Tradeoffs: latency vs. cost, quality vs. privacy, and how you mitigated each.
Ready to build it?
This project bridges no-code UX and hands-on backend deployment — a perfect portfolio piece for 2026. Start by cloning your dataset and spinning up the Node.js microservice. Flip the USE_LOCAL switch to test local inference on your Pi, or point the connector at ChatGPT/Claude to compare quality. Most importantly: iterate fast, keep the scope bite-sized, and document each decision — hiring managers care about tradeoffs, not perfection.
Call to action: Fork the starter repo, deploy to a Raspberry Pi, and tweet your demo with the tag #microappPi so the community can try your recommender. Want a starter repo or a checklist for your class? Reply and I’ll provide a ready-to-run template plus a simple Glide sheet to hook into your Node.js endpoint.
Related Reading
- Micro Apps at Scale: Governance and Best Practices for IT Admins
- Edge-First, Cost-Aware Strategies for Microteams in 2026
- Field Review: Nomad Qubit Carrier v1 — Mobile Testbeds
- Portable Study Kits and On-Device Tools — 2026 Roundup