Stuck in decision fatigue? Build a tiny, personal restaurant recommender that runs on a Raspberry Pi — no massive cloud bill required.
Students, teachers, and lifelong learners: you don’t need an expensive course or a production-grade team to ship a practical portfolio project that blends no-code micro apps with local AI inference. In this hands-on guide (2026 edition) I’ll walk you through building a compact restaurant recommender inspired by Rebecca Yu’s vibe-coding project, deploying it to a Raspberry Pi, and switching between two interchangeable inference options: a cloud LLM (ChatGPT or Anthropic Claude) and a local model running on-device.
Why this project matters in 2026
Micro apps — tiny, personal, single-purpose applications — exploded in popularity after 2023 and matured into reliable tools by late 2025. They are perfect portfolio pieces because they show you can identify a real need, prototype fast, and ship a working product. Meanwhile, hardware improvements (Raspberry Pi 5 + AI HAT+ 2) and better small-model runtimes make local inference viable for many micro apps.
“It took Rebecca Yu seven days to vibe code her dining app.” — TechCrunch summary of Rebecca Yu’s Where2Eat project (inspiration)
In short: this project demonstrates modern trends (micro apps, local AI, and hybrid cloud/local inference) and produces a deployable Node.js microservice plus a no-code or low-code frontend you can show off on your resume.
Project overview — what you'll build
- Frontend: a lightweight no-code micro app or static UI (Glide/Retool/Simple HTML) to collect user preferences and display recommendations.
- Backend: a Node.js microservice that fetches restaurant data, scores options, and calls an LLM for refined suggestions.
- Inference: two interchangeable backends — (A) cloud LLM (OpenAI ChatGPT or Anthropic Claude) via API, and (B) local model running on Raspberry Pi (via llama.cpp / ggml-style runtime or an AI HAT-enabled binary).
- Deployment: run the app on a Raspberry Pi (systemd + reverse proxy) and expose it to your local network or a private URL.
Prerequisites
- Raspberry Pi 5 (recommended) or Pi 4 with 8GB+ RAM. If you have the AI HAT+ 2 (launched late 2025), you get better inferencing headroom on-device — see hardware field notes like the Nomad Qubit Carrier v1 and other edge testbeds for context.
- Basic familiarity with Node.js, npm, and terminal commands.
- Optional accounts: OpenAI (for ChatGPT API) or Anthropic (for Claude) if you plan to use cloud inference.
- No-code tool choice: Glide, Retool, or a static site generator. You can also use a simple HTML + JS page to keep everything free.
Step 1 — Define the data model
Start with a small JSON file of restaurants and their attributes, saved as restaurants.json. Keep it simple so you can expand later (and leave comments out of the file, since JSON does not allow them):
[
{ "id": 1, "name": "Sunrise Noodles", "cuisine": "Chinese", "price": 2, "rating": 4.3, "tags": ["noodles","cozy"] },
{ "id": 2, "name": "La Taqueria", "cuisine": "Mexican", "price": 1, "rating": 4.7, "tags": ["tacos","fast"] },
{ "id": 3, "name": "Green Table", "cuisine": "Vegan", "price": 3, "rating": 4.2, "tags": ["healthy","brunch"] }
]
Keep the dataset local for now (JSON or a small SQLite DB). This keeps user data private and makes local-only demos feasible.
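Before wiring the file into a server, it can help to check that every record has the fields the scoring code will read. The sketch below is one possible validator, not part of the original project: the function names are hypothetical, and the 1-3 price scale and 0-5 rating range are assumptions inferred from the sample data.

```javascript
// validate-restaurants.js: hypothetical helper for checking restaurants.json records.
// Assumptions (not stated in the article): price is 1, 2, or 3; rating is 0 to 5.
function validateRestaurant(r) {
  if (r === null || typeof r !== 'object') return ['record must be an object'];
  const errors = [];
  if (!Number.isInteger(r.id)) errors.push('id must be an integer');
  if (typeof r.name !== 'string' || r.name.trim() === '') errors.push('name must be a non-empty string');
  if (typeof r.cuisine !== 'string') errors.push('cuisine must be a string');
  if (![1, 2, 3].includes(r.price)) errors.push('price must be 1, 2, or 3');
  if (typeof r.rating !== 'number' || r.rating < 0 || r.rating > 5) errors.push('rating must be between 0 and 5');
  if (!Array.isArray(r.tags) || !r.tags.every(t => typeof t === 'string')) errors.push('tags must be an array of strings');
  return errors;
}

// Returns one entry per invalid record, with its index and the list of problems.
function validateDataset(list) {
  const problems = [];
  list.forEach((r, index) => {
    const errors = validateRestaurant(r);
    if (errors.length > 0) problems.push({ index, errors });
  });
  return problems;
}

// Illustrative data: the second record is deliberately broken.
const sample = [
  { id: 1, name: 'Sunrise Noodles', cuisine: 'Chinese', price: 2, rating: 4.3, tags: ['noodles', 'cozy'] },
  { id: 2, name: '', cuisine: 'Mexican', price: 5, rating: 4.7, tags: ['tacos'] }
];
console.log(JSON.stringify(validateDataset(sample), null, 2));
```

Running a check like this once at startup means a typo in the dataset surfaces as a clear message instead of a NaN score later on.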
Step 2 — Build the Node.js microservice
Create a tiny Express server with one primary endpoint: /recommend. The endpoint accepts user preferences (cuisine, budget, mood) and returns ranked results. Save this as index.js.
const express = require('express');
const fs = require('fs');
// Node 18+ ships a global fetch, so node-fetch is not required here
// (node-fetch v3 is ESM-only and cannot be loaded with require()).
const RESTAURANTS = JSON.parse(fs.readFileSync('./restaurants.json', 'utf8'));
const app = express();
app.use(express.json()); // built-in JSON body parser (Express 4.16+)
const useLocal = process.env.USE_LOCAL === 'true';
const localEndpoint = 'http://localhost:5000/llm'; // example local LLM adapter
function scoreSimple(pref, r) {
  // Simple deterministic score: rating + cuisine match + budget proximity
  let score = r.rating;
  if (pref.cuisine && pref.cuisine.toLowerCase() === r.cuisine.toLowerCase()) score += 1.0;
  score -= Math.abs((pref.budget || 2) - r.price) * 0.3;
  if (Array.isArray(pref.tags)) {
    r.tags.forEach(t => { if (pref.tags.includes(t)) score += 0.2; });
  }
  return score;
}
async function callLLM(prompt) {
  if (useLocal) {
    const res = await fetch(localEndpoint, {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({prompt})
    });
    if (!res.ok) throw new Error(`Local LLM returned HTTP ${res.status}`);
    return (await res.json()).text;
  }
  // Cloud example: OpenAI Chat Completions (replace with your provider)
  const key = process.env.OPENAI_API_KEY;
  const r = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${key}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'gpt-4o-mini', // example modern model
      messages: [{role: 'user', content: prompt}],
      temperature: 0.7
    })
  });
  if (!r.ok) throw new Error(`Cloud LLM returned HTTP ${r.status}`);
  const j = await r.json();
  return j.choices?.[0]?.message?.content || '';
}
app.post('/recommend', async (req, res) => {
  const prefs = req.body || {};
  // quick deterministic candidates
  const scored = RESTAURANTS.map(r => ({...r, score: scoreSimple(prefs, r)}))
    .sort((a, b) => b.score - a.score)
    .slice(0, 5);
  // refine with LLM for personalized explanations and tiebreakers
  const prompt = `User preferences: ${JSON.stringify(prefs)}\nCandidates: ${JSON.stringify(scored)}\nGive me a ranked list of these restaurants with a short reason for each (1-2 sentences).`;
  let llmText = '';
  try {
    llmText = await callLLM(prompt);
  } catch (err) {
    // LLM unavailable: return the deterministic ranking on its own
    console.error('LLM call failed:', err.message);
  }
  res.json({candidates: scored, llmExplanation: llmText});
});
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Server running on ${PORT}`));
This pattern does two useful things: a cheap deterministic ranking for offline speed, plus an LLM-based refinement for better user-facing suggestions when an LLM is available.
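To make the deterministic half concrete, here is a small worked example of the scoring arithmetic on the three sample restaurants. It repeats scoreSimple so it runs on its own; the prefs object is a hypothetical request body, and scores are rounded to two decimals only for display.

```javascript
// Copy of the article's scoring function, repeated so this snippet is self-contained.
function scoreSimple(pref, r) {
  let score = r.rating;
  if (pref.cuisine && pref.cuisine.toLowerCase() === r.cuisine.toLowerCase()) score += 1.0;
  score -= Math.abs((pref.budget || 2) - r.price) * 0.3;
  if (Array.isArray(pref.tags)) {
    r.tags.forEach(t => { if (pref.tags.includes(t)) score += 0.2; });
  }
  return score;
}

const restaurants = [
  { id: 1, name: 'Sunrise Noodles', cuisine: 'Chinese', price: 2, rating: 4.3, tags: ['noodles', 'cozy'] },
  { id: 2, name: 'La Taqueria', cuisine: 'Mexican', price: 1, rating: 4.7, tags: ['tacos', 'fast'] },
  { id: 3, name: 'Green Table', cuisine: 'Vegan', price: 3, rating: 4.2, tags: ['healthy', 'brunch'] }
];

// Hypothetical request: cheap Mexican food, served fast.
const prefs = { cuisine: 'mexican', budget: 1, tags: ['fast'] };

const ranked = restaurants
  .map(r => ({ name: r.name, score: Number(scoreSimple(prefs, r).toFixed(2)) }))
  .sort((a, b) => b.score - a.score);

// La Taqueria:     4.7 + 1.0 (cuisine) - 0.0 (budget) + 0.2 (tag) = 5.9
// Sunrise Noodles: 4.3 - 0.3 (one price step away)                = 4.0
// Green Table:     4.2 - 0.6 (two price steps away)               = 3.6
console.log(ranked);
```

The cuisine bonus outweighs a one-step budget mismatch, so the weights are worth tuning if price matters more to your users than cuisine.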
Choosing cloud vs local inference
- Cloud (ChatGPT / Claude): best for quality, natural language nuance, and quick iteration. Expect latency and API costs — see reviews of cloud cost and observability tooling when planning your demo budget. Use when your demo audience is small or you have a budget for API calls.
- Local: best for privacy, low recurring cost, and offline capability. Tradeoffs: smaller models, potential accuracy and latency limits. Great for demonstrating edge deployments and applied ML skills; see hardware and edge-AI field notes like Edge AI for Retail for performance expectations.
Step 3 — Local inference on Raspberry Pi (2026 tips)
By late 2025–early 2026, the Raspberry Pi 5 (and the new AI HAT+ 2) made local LLM inference realistic for micro apps. You’ll typically run a lightweight runtime (llama.cpp, ggml/gguf runtimes, or an optimized vendor binary) that exposes a simple HTTP API the Node.js service can call.
Install base OS and Node
- Flash Raspberry Pi OS 64-bit or Ubuntu 22.04/24.04 (64-bit).
- Install Node.js 20+ (or use nvm).
- Optional: Install Docker if you prefer containerized runtimes.
Run a local LLM adapter
Several community runtimes expose a REST API wrapper around a local model. The simplest is to run a small adapter that listens on port 5000 and accepts a prompt. For performance, use a quantized gguf model tuned for edge devices. Example adapter (pseudo):
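// local-llm-adapter.js (very small wrapper)
const express = require('express');
const { spawn } = require('child_process');
const app = express();
app.use(express.json()); // built-in JSON body parser (Express 4.16+)
app.post('/llm', (req, res) => {
  const prompt = req.body && req.body.prompt;
  if (typeof prompt !== 'string' || prompt === '') {
    return res.status(400).json({ error: 'prompt must be a non-empty string' });
  }
  // call a local binary (llama.cpp, or an AI-HAT binary); adjust the path and flags per runtime
  const child = spawn('./bin/llm-local', ['--prompt', prompt]);
  let out = '';
  child.stdout.on('data', d => out += d.toString());
  // 'error' fires when the binary is missing or not executable
  child.on('error', err => {
    if (!res.headersSent) res.status(500).json({ error: err.message });
  });
  child.on('close', code => {
    if (res.headersSent) return;
    if (code !== 0) return res.status(500).json({ error: `runtime exited with code ${code}` });
    res.json({ text: out });
  });
});
app.listen(5000);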
Notes:
- Use a small 4-bit quantized model to fit RAM; 2026 runtimes often use GGUF quantization to balance speed and quality — see compact model guides and community notes like those in the Portable Study Kits roundup for device-friendly choices.
- If you have an AI HAT+ 2, follow the vendor docs to accelerate inference; it can be the difference between unusable and responsive.
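A quick back-of-the-envelope calculation shows why quantization matters on an 8 GB board. The sketch below estimates weight storage only (parameters times bits per weight, divided by 8); the model sizes are hypothetical examples, and real quantized files run somewhat larger because they store scaling metadata, with the KV cache and runtime overhead on top.

```javascript
// Rough weights-only estimate: parameters x bits per weight / 8 bits per byte.
function estimateWeightBytes(paramCount, bitsPerWeight) {
  return (paramCount * bitsPerWeight) / 8;
}

function toGiB(bytes) {
  return bytes / 1024 ** 3;
}

// Hypothetical model sizes, chosen only to illustrate the arithmetic.
const sizes = [
  { label: '1B', params: 1e9 },
  { label: '3B', params: 3e9 },
  { label: '7B', params: 7e9 }
];

for (const { label, params } of sizes) {
  const fp16 = toGiB(estimateWeightBytes(params, 16)).toFixed(1);
  const q4 = toGiB(estimateWeightBytes(params, 4)).toFixed(1);
  console.log(`${label}: 16-bit ~${fp16} GiB, 4-bit ~${q4} GiB`);
}
```

By this estimate a 3-billion-parameter model needs about 1.4 GiB for weights at 4 bits versus about 5.6 GiB at 16 bits, which is the gap between fitting comfortably next to the OS and Node.js and crowding them out.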
Step 4 — No-code frontend options
For a micro app you don’t need a heavy React app. Pick one of these low-friction routes:
- Glide / Softr: Connect a Google Sheet or Airtable and call your Node.js endpoint with a webhook to fetch recommendations. Good for fast demos and shareable UI.
- Retool / AppSmith: Build internal-facing micro apps and connect to your Node endpoint for richer controls. If you plan to scale or show governance, check micro-app governance patterns.
- Static HTML + Fetch: Hand-code a small page that posts to /recommend and renders results. This is ideal for a portfolio because it shows you wrote the glue code.
Example minimal client (static):
<form id="prefs">
<input name="cuisine" placeholder="Cuisine"/>
<input name="budget" placeholder="Budget 1-3"/>
<button>Recommend</button>
</form>
<div id="out"></div>
<script>
document.getElementById('prefs').addEventListener('submit', async e => {
e.preventDefault();
const form = e.target;
const prefs = { cuisine: form.cuisine.value, budget: Number(form.budget.value) };
const r = await fetch('/recommend', {method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify(prefs)});
const j = await r.json();
document.getElementById('out').innerText = JSON.stringify(j, null, 2);
});
</script>
Step 5 — Deploy to Raspberry Pi
- Clone your project on the Pi and install npm deps.
- Install your local LLM adapter (if using). Place the quantized model on disk and ensure the runtime can access it.
- Create a systemd service for your Node app so it restarts automatically.
- Use Caddy or Nginx as a reverse proxy to add TLS and friendly URLs. Caddy obtains and renews HTTPS certificates automatically when the Pi is reachable under a public domain name.
Example systemd unit (save as /etc/systemd/system/restaurant.service):
[Unit]
Description=Micro Restaurant Recommender
After=network.target
[Service]
ExecStart=/usr/bin/node /home/pi/project/index.js
WorkingDirectory=/home/pi/project
Restart=always
User=pi
Environment=USE_LOCAL=true
[Install]
WantedBy=multi-user.target
Step 6 — Performance, cost, and UX tradeoffs
By 2026, hybrid deployments are the pragmatic default:
- Local-first, cloud-optional: use deterministic scoring first, then call the local LLM for on-device personalization. Fall back to a cloud LLM only if local inference is slow or you need higher-quality responses, an approach aligned with edge-first, cost-aware strategies.
- Quantize to fit memory: 4-bit quantization and GGUF models are the norm for edge devices; check your runtime docs for guidance and model notes in community hardware roundups like Portable Study Kits.
- Batch and cache: cache LLM explanations or precompute rankings during low-load times to reduce latency and costs — monitoring cost signals (and tools) can help, see cloud observability reviews.
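The local-first and caching ideas above can be combined in a few lines. This is a minimal sketch under stated assumptions: callLocal and callCloud are hypothetical functions that each take a prompt and return a promise of text, the 8-second timeout and 10-minute cache lifetime are arbitrary defaults, and the cache is an in-memory Map that is lost on restart.

```javascript
// In-memory cache: prompt -> { text, expires }. Hypothetical; cleared on restart.
const cache = new Map();

// Rejects if the given promise does not settle within ms milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Tries the cache, then the local model, then the cloud model.
async function explain(prompt, { callLocal, callCloud, localTimeoutMs = 8000, ttlMs = 600000 } = {}) {
  const hit = cache.get(prompt);
  if (hit && hit.expires > Date.now()) return { text: hit.text, source: 'cache' };

  let text;
  let source;
  try {
    text = await withTimeout(callLocal(prompt), localTimeoutMs);
    source = 'local';
  } catch (err) {
    // Local inference failed or was too slow: fall back to the cloud provider.
    text = await callCloud(prompt);
    source = 'cloud';
  }
  cache.set(prompt, { text, expires: Date.now() + ttlMs });
  return { text, source };
}
```

Returning the source alongside the text makes it easy to log how often the Pi answers on its own, which is a useful number to quote in a portfolio write-up.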
Prompts that work — ChatGPT vs Claude vs Local
Design prompts that supply structured context. Example (short):
Prompt:
User preferences: {"cuisine":"Mexican","budget":1,"occasion":"quick lunch"}
Candidates: [ {id:2, name:"La Taqueria", score:4.6, ...}, ... ]
Task: Return a ranked list (1-3) with a one-line reason per item. Consider speed and price for a quick lunch.
Notes:
- ChatGPT-style models are generally stronger at conversational reasoning.
- Anthropic Claude often focuses on safer, structured outputs and can be easier to prompt for tabular responses.
- Local models vary — tune prompts and temperature conservatively and rely on deterministic scoring for critical constraints (e.g., affordability).
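One way to keep prompts consistent across providers is to build them from the same structured inputs every time. The helper below is a hypothetical sketch (buildPrompt is not part of the original code); it trims candidates to the fields the model needs and adds an instruction to stay within the supplied list.

```javascript
// Hypothetical prompt builder: same structure for cloud and local models.
function buildPrompt(prefs, candidates, maxItems = 3) {
  // Send only the fields the model needs; smaller prompts help small local models.
  const slim = candidates
    .slice(0, maxItems)
    .map(({ id, name, cuisine, price, rating, score }) => ({ id, name, cuisine, price, rating, score }));
  return [
    `User preferences: ${JSON.stringify(prefs)}`,
    `Candidates: ${JSON.stringify(slim)}`,
    `Task: Return a ranked list (1-${slim.length}) with a one-line reason per item.`,
    'Only mention restaurants from the Candidates list.'
  ].join('\n');
}

// Illustrative call with one candidate from the sample dataset.
const examplePrompt = buildPrompt(
  { cuisine: 'Mexican', budget: 1, occasion: 'quick lunch' },
  [{ id: 2, name: 'La Taqueria', cuisine: 'Mexican', price: 1, rating: 4.7, score: 5.7, tags: ['tacos', 'fast'] }]
);
console.log(examplePrompt);
```

Because the deterministic score travels inside the prompt, the model can explain the ranking rather than invent one.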
Real-world examples & case study
Rebecca Yu’s Where2Eat (2023 coverage) is a classic micro app example: quick, personal, and problem-driven. Hardware improvements in late 2025 made similar personal apps far more capable on-device. For a class project, students have shipped variants that combine a Glide frontend, a Pi-based Node.js backend, and a tiny LLM for natural language reasoning.
Advanced strategies and future-proofing (2026+)
- Edge orchestration: offload heavy inference to a local NUC or small GPU box if you outgrow Pi performance — see orchestration patterns in edge-aware orchestration.
- Federated preferences: keep user preferences on-device and sync anonymized signals to a central index for group recommendations — this pattern connects to privacy-first preference designs like privacy-first preference centers.
- Modular backends: design your Node.js adapter to switch between providers by changing an environment variable. This makes demos easier and safer for portfolio review.
- Explainability: surface short LLM explanations so users trust the recommendation; this is a small UX tweak with big impact — creators are also using micro-events and maker communities to test UX in the wild (see maker pop-up strategies).
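The modular-backend idea can be sketched as a small provider table. Everything below is an illustration rather than the original project's code: LLM_PROVIDER, OPENAI_MODEL, ANTHROPIC_MODEL, and LOCAL_LLM_URL are hypothetical environment variables, and you would set ANTHROPIC_MODEL to a current model ID from Anthropic's documentation. It assumes Node 18+ for the global fetch.

```javascript
// Each provider knows how to build its HTTP request and how to read its response.
const PROVIDERS = {
  openai: {
    buildRequest(prompt, env) {
      return {
        url: 'https://api.openai.com/v1/chat/completions',
        headers: { 'Authorization': `Bearer ${env.OPENAI_API_KEY}`, 'Content-Type': 'application/json' },
        body: { model: env.OPENAI_MODEL || 'gpt-4o-mini', messages: [{ role: 'user', content: prompt }], temperature: 0.7 }
      };
    },
    parse: j => j.choices?.[0]?.message?.content || ''
  },
  anthropic: {
    buildRequest(prompt, env) {
      return {
        url: 'https://api.anthropic.com/v1/messages',
        headers: { 'x-api-key': env.ANTHROPIC_API_KEY, 'anthropic-version': '2023-06-01', 'Content-Type': 'application/json' },
        // The Messages API requires max_tokens; 512 is an arbitrary choice for short explanations.
        body: { model: env.ANTHROPIC_MODEL, max_tokens: 512, messages: [{ role: 'user', content: prompt }] }
      };
    },
    parse: j => j.content?.[0]?.text || ''
  },
  local: {
    buildRequest(prompt, env) {
      return {
        url: env.LOCAL_LLM_URL || 'http://localhost:5000/llm',
        headers: { 'Content-Type': 'application/json' },
        body: { prompt }
      };
    },
    parse: j => j.text || ''
  }
};

// Picks a provider from the environment, sends the prompt, returns plain text.
async function callLLM(prompt, env = process.env) {
  const name = env.LLM_PROVIDER || 'openai';
  const provider = PROVIDERS[name];
  if (!provider) throw new Error(`Unknown LLM_PROVIDER: ${name}`);
  const { url, headers, body } = provider.buildRequest(prompt, env);
  const res = await fetch(url, { method: 'POST', headers, body: JSON.stringify(body) });
  if (!res.ok) throw new Error(`${name} returned HTTP ${res.status}`);
  return provider.parse(await res.json());
}
```

With this shape, switching a demo from Claude to the on-device model is a matter of starting the service with LLM_PROVIDER=local, and adding a provider means adding one entry to the table.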
Troubleshooting checklist
- If inference is slow: verify model quantization, ensure swap is disabled/limited, or upgrade to AI HAT+ 2 / small GPU — field reviews like Nomad Qubit Carrier discuss real-device tradeoffs.
- If responses are hallucinated: rely more on deterministic scores or provide stricter system prompts limiting creative extrapolation.
- If the Pi overheats under load: add a cooling fan or throttle inference. Edge inference is a marathon, not a sprint.
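For the hallucination case, one cheap guard is to accept the LLM's explanation only when it mentions every candidate by name, and otherwise show a templated explanation built from the deterministic data. This is a heuristic sketch with hypothetical function names; it will not catch every invented detail, only answers that drift away from the supplied list.

```javascript
// True when the text mentions every candidate's name (case-insensitive).
function mentionsAll(candidates, llmText) {
  const text = (llmText || '').toLowerCase();
  return candidates.every(c => text.includes(c.name.toLowerCase()));
}

// Deterministic fallback built only from fields in the dataset.
function templateExplanation(candidates) {
  return candidates
    .map((c, i) => `${i + 1}. ${c.name}: ${c.cuisine}, price level ${c.price}, rated ${c.rating}`)
    .join('\n');
}

// Uses the LLM text when it passes the check, otherwise the template.
function safeExplanation(candidates, llmText) {
  return mentionsAll(candidates, llmText) ? llmText : templateExplanation(candidates);
}

// Illustrative data from the sample dataset.
const candidates = [
  { name: 'La Taqueria', cuisine: 'Mexican', price: 1, rating: 4.7 },
  { name: 'Sunrise Noodles', cuisine: 'Chinese', price: 2, rating: 4.3 }
];
console.log(safeExplanation(candidates, '1. Taco Palace is great for lunch.'));
```

In the example call the model's text names a restaurant that is not in the list and omits both real candidates, so the templated explanation is shown instead.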
Actionable takeaways
- Start small: ship the Node.js microservice with a JSON dataset first.
- Design for hybrid inference: local deterministic scoring + optional LLM refinement.
- Use a no-code frontend to iterate UI quickly; replace with custom UI later to show engineering depth.
- Deploy on Raspberry Pi to demonstrate real-device skills and edge deployment know-how.
Resources & further reading (2026)
- TechCrunch coverage of Rebecca Yu’s Where2Eat for inspiration.
- ZDNET coverage of the Raspberry Pi AI HAT+ 2 (late 2025) for hardware context.
- Vendor docs for your chosen local runtime (llama.cpp, ggml/gguf adapters): follow current install and quantization guides.
Final notes — what to show on your portfolio
When you publish this project, highlight:
- Problem statement and user stories: decision fatigue in group chats.
- Architecture diagram: frontend (no-code) → Node.js → LLM (local/cloud).
- Tradeoffs: latency vs. cost, quality vs. privacy, and how you mitigated each.
Ready to build it?
This project bridges no-code UX and hands-on backend deployment — a perfect portfolio piece for 2026. Start by cloning your dataset and spinning up the Node.js microservice. Flip the USE_LOCAL switch to test local inference on your Pi, or point the connector at ChatGPT/Claude to compare quality. Most importantly: iterate fast, keep the scope bite-sized, and document each decision — hiring managers care about tradeoffs, not perfection.
Call to action: Fork the starter repo, deploy to a Raspberry Pi, and tweet your demo with the tag #microappPi so the community can try your recommender. Want a starter repo or a checklist for your class? Reply and I’ll provide a ready-to-run template plus a simple Glide sheet to hook into your Node.js endpoint.
Related Reading
- Micro Apps at Scale: Governance and Best Practices for IT Admins
- Edge-First, Cost-Aware Strategies for Microteams in 2026
- Field Review: Nomad Qubit Carrier v1 — Mobile Testbeds
- Portable Study Kits and On-Device Tools — 2026 Roundup