Search Service
Mana Search provides web search and content extraction capabilities for Manacore applications.
Architecture

    ┌─────────────────────────────────────────────────────────────┐
    │                        Consumer Apps                        │
    │   Questions │ Chat │ Project Doc Bot │ Future Apps          │
    └─────────────────────────┬───────────────────────────────────┘
                              ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                   mana-search (Port 3021)                   │
    │   Search API │ Extract API │ Redis Cache                    │
    └─────────────────────────┬───────────────────────────────────┘
                              ▼
    ┌─────────────────────────────────────────────────────────────┐
    │                SearXNG (Port 8080, internal)                │
    │   Google │ Bing │ DuckDuckGo │ Wikipedia │ arXiv │ ...      │
    └─────────────────────────────────────────────────────────────┘

Quick Start

    # Start SearXNG + Redis (for local development)
    cd services/mana-search
    docker-compose -f docker-compose.dev.yml up -d

    # Start NestJS API
    pnpm --filter @mana-search/service dev

    # Or start everything via Docker
    cd services/mana-search
    docker-compose up -d

API Endpoints
Web Search

    POST /api/v1/search
    Content-Type: application/json

    {
      "query": "quantum computing basics",
      "options": {
        "categories": ["general", "science"],
        "engines": ["google", "wikipedia"],
        "limit": 10
      }
    }

Response:

    {
      "results": [
        {
          "title": "Quantum Computing - Wikipedia",
          "url": "https://en.wikipedia.org/wiki/Quantum_computing",
          "snippet": "Quantum computing is a type of computation...",
          "engine": "wikipedia"
        }
      ],
      "meta": {
        "query": "quantum computing basics",
        "totalResults": 10,
        "searchTime": 0.523
      }
    }

Content Extraction

    POST /api/v1/extract
    Content-Type: application/json

    {
      "url": "https://example.com/article",
      "options": {
        "includeMarkdown": true
      }
    }

Response:

    {
      "title": "Article Title",
      "content": "Full article text...",
      "markdown": "# Article Title\n\nFull article text...",
      "metadata": {
        "author": "John Doe",
        "publishedAt": "2024-01-15"
      }
    }

Bulk Extract

    POST /api/v1/extract/bulk
    Content-Type: application/json

    {
      "urls": [
        "https://example.com/article1",
        "https://example.com/article2"
      ],
      "options": {
        "includeMarkdown": true
      }
    }

Search Categories
| Category | Engines |
|---|---|
| general | Google, Bing, DuckDuckGo, Brave, Wikipedia |
| news | Google News, Bing News |
| science | arXiv, Google Scholar, PubMed, Semantic Scholar |
| it | GitHub, StackOverflow, NPM, MDN |
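
As a rough illustration of how a caller might keep requests within the categories listed above, the sketch below builds a search request body and rejects unknown category names. `CATEGORY_ENGINES`, `isCategory`, and `buildSearchRequest` are hypothetical names and not part of mana-search; the engine lists are copied from the table, and the default limit of 10 is an assumption taken from the request example.

```typescript
// Hypothetical client-side helper; not part of mana-search itself.
// Category names and engine lists are copied from the table above.
const CATEGORY_ENGINES = {
  general: ['Google', 'Bing', 'DuckDuckGo', 'Brave', 'Wikipedia'],
  news: ['Google News', 'Bing News'],
  science: ['arXiv', 'Google Scholar', 'PubMed', 'Semantic Scholar'],
  it: ['GitHub', 'StackOverflow', 'NPM', 'MDN'],
} as const;

type Category = keyof typeof CATEGORY_ENGINES;

interface SearchRequest {
  query: string;
  options: { categories: Category[]; limit: number };
}

// Type guard: true only for the category names defined in the map.
function isCategory(name: string): name is Category {
  return Object.prototype.hasOwnProperty.call(CATEGORY_ENGINES, name);
}

// Builds the JSON body for POST /api/v1/search, failing fast on unknown categories.
function buildSearchRequest(query: string, categories: string[], limit = 10): SearchRequest {
  const unknown = categories.filter((c) => !isCategory(c));
  if (unknown.length > 0) {
    throw new Error(`Unknown categories: ${unknown.join(', ')}`);
  }
  return { query, options: { categories: categories.filter(isCategory), limit } };
}

const body = buildSearchRequest('quantum computing basics', ['general', 'science'], 5);
console.log(JSON.stringify(body));
```

Validating on the client side is optional; it simply surfaces typos in category names before a request is sent.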
Usage in Backend
Direct Fetch

    const response = await fetch('http://mana-search:3021/api/v1/search', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        query: 'machine learning basics',
        options: {
          categories: ['general', 'science'],
          limit: 5,
        },
      }),
    });

    const { results, meta } = await response.json();

Service Class
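
    import { Injectable } from '@nestjs/common';
    import { ConfigService } from '@nestjs/config';

    // Assumed option shapes, inferred from the request examples in this document.
    interface SearchOptions {
      categories?: string[];
      engines?: string[];
      limit?: number;
    }

    interface ExtractOptions {
      includeMarkdown?: boolean;
    }

    @Injectable()
    export class SearchService {
      private readonly baseUrl: string;

      constructor(config: ConfigService) {
        // config.get() may return undefined, so fall back to the local default.
        this.baseUrl = config.get<string>('MANA_SEARCH_URL') ?? 'http://localhost:3021';
      }

      async search(query: string, options?: SearchOptions) {
        const response = await fetch(`${this.baseUrl}/api/v1/search`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ query, options }),
        });

        if (!response.ok) {
          // Attach the HTTP status so callers can branch on error.status.
          throw Object.assign(new Error(`Search failed: ${response.status}`), {
            status: response.status,
          });
        }

        return response.json();
      }

      async extract(url: string, options?: ExtractOptions) {
        const response = await fetch(`${this.baseUrl}/api/v1/extract`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ url, options }),
        });

        if (!response.ok) {
          // Attach the HTTP status so callers can branch on error.status.
          throw Object.assign(new Error(`Extract failed: ${response.status}`), {
            status: response.status,
          });
        }

        return response.json();
      }
    }

Caching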
Mana Search uses Redis for caching:
| Cache | TTL | Purpose |
|---|---|---|
| Search results | 1 hour | Reduce API calls to search engines |
| Extracted content | 24 hours | Cache parsed articles |
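
To make the two TTLs concrete, here is a minimal in-memory sketch of TTL-based caching. It is a stand-in for illustration only: the real service uses Redis, and the `TtlCache` class, the `search:` / `extract:` key format, and the injected clock are assumptions rather than mana-search internals.

```typescript
// Illustrative in-memory stand-in for the Redis cache; all names are hypothetical.
const SEARCH_TTL_SECONDS = 3600; // 1 hour, per the table above
const EXTRACT_TTL_SECONDS = 86400; // 24 hours, per the table above

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop expired entries lazily on read
      return undefined;
    }
    return entry.value;
  }
}

// Assumed key scheme: one prefix per cache plus the normalized query or the URL.
const searchKey = (query: string): string => `search:${query.trim().toLowerCase()}`;
const extractKey = (url: string): string => `extract:${url}`;

let clock = 0; // fake clock in milliseconds
const cache = new TtlCache<string>(() => clock);
cache.set(searchKey('  Quantum Computing '), 'cached results', SEARCH_TTL_SECONDS);
cache.set(extractKey('https://example.com/article'), 'cached article', EXTRACT_TTL_SECONDS);

clock += 61 * 60 * 1000; // 61 minutes later
console.log(cache.get(searchKey('quantum computing'))); // undefined: past the 1 hour TTL
console.log(cache.get(extractKey('https://example.com/article'))); // still cached
```

Normalizing the query before building the key is a design choice in this sketch: it lets trivially different spellings of the same query share one cache entry.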
Environment Variables

    MANA_SEARCH_URL=http://localhost:3021
    SEARXNG_URL=http://localhost:8080
    REDIS_HOST=localhost
    REDIS_PORT=6379
    CACHE_SEARCH_TTL=3600
    CACHE_EXTRACT_TTL=86400

Health Check

    GET /health

    # Response
    {
      "status": "ok",
      "searxng": "connected",
      "redis": "connected"
    }

Rate Limiting
The service includes built-in rate limiting:
- Search: 30 requests/minute per IP
- Extract: 60 requests/minute per IP
- Bulk Extract: 10 requests/minute per IP
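
A caller can stay under these limits by throttling on its own side. The sketch below is a simple sliding-window limiter; `SlidingWindowLimiter` is a hypothetical client-side helper and says nothing about how mana-search enforces its limits internally.

```typescript
// Hypothetical client-side limiter; the per-minute limits mirror the list above.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if it fits in the current window.
  tryAcquire(nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.maxRequests) return false;
    this.timestamps.push(nowMs);
    return true;
  }

  // Milliseconds until the oldest recorded request leaves the window (0 if a slot is free).
  retryAfterMs(nowMs: number = Date.now()): number {
    const cutoff = nowMs - this.windowMs;
    const active = this.timestamps.filter((t) => t > cutoff);
    const oldest = active[0];
    if (oldest === undefined || active.length < this.maxRequests) return 0;
    return oldest + this.windowMs - nowMs;
  }
}

const limits = {
  search: new SlidingWindowLimiter(30, 60000),
  extract: new SlidingWindowLimiter(60, 60000),
  bulkExtract: new SlidingWindowLimiter(10, 60000),
};

if (limits.search.tryAcquire()) {
  console.log('OK to call /api/v1/search');
} else {
  console.log(`Wait ${limits.search.retryAfterMs()} ms before searching`);
}
```

Because the service limits per IP, a limiter like this only helps when all requests from that IP pass through the same instance.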
Error Handling

    try {
      const results = await searchService.search('query');
    } catch (error) {
      // Caught values are untyped, so read the status defensively.
      const status = (error as { status?: number }).status;
      if (status === 429) {
        // Rate limited - wait and retry
      } else if (status === 503) {
        // Search engines unavailable
      }
    }

Best Practices
- Cache aggressively: search results rarely change, so repeated queries can be served from cache.
- Use appropriate categories: narrower categories return more relevant results.
- Limit results: fetch only as many as you need.
- Handle failures gracefully: external services can be unreliable.
- Respect rate limits: implement a backoff strategy for rejected requests.
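
The last two practices can be combined in a small retry wrapper. This is one possible sketch, assuming thrown errors carry a numeric `status` property as in the error handling example above; `withBackoff`, `statusOf`, the retry count, and the delay values are all hypothetical and not part of mana-search.

```typescript
// Hypothetical helper: retries on 429 (rate limited) and 503 (search engines unavailable).
interface BackoffOptions {
  retries?: number; // maximum number of retries after the first attempt
  baseDelayMs?: number; // delay before the first retry; doubles on each retry
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

// Reads a numeric `status` from an unknown thrown value, if present.
function statusOf(error: unknown): number | undefined {
  if (typeof error === 'object' && error !== null && 'status' in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === 'number' ? status : undefined;
  }
  return undefined;
}

async function withBackoff<T>(fn: () => Promise<T>, options: BackoffOptions = {}): Promise<T> {
  const retries = options.retries ?? 3;
  const baseDelayMs = options.baseDelayMs ?? 500;
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const status = statusOf(error);
      const retryable = status === 429 || status === 503;
      // Give up on non-retryable errors or once the retry budget is spent.
      if (!retryable || attempt >= retries) throw error;
      await sleep(baseDelayMs * 2 ** attempt); // 500 ms, 1 s, 2 s with the defaults
    }
  }
}
```

A call such as `withBackoff(() => searchService.search('query'))` would then retry rate-limited or unavailable responses and rethrow anything else unchanged.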