
Search Summary

ng-druid edited this page Oct 6, 2025 · 1 revision

Taken together, your Go packages form an in-memory, GitHub-backed Search and Analytics Engine. It turns structured data stored as JSON files in GitHub repositories into a queryable knowledge base, removing the need for a managed database or an external search service in specific use cases.

Here is a marketing summary of the system, highlighting its core value proposition and key features.


🚀 The Vertigo Search & Analytics Engine: Turn GitHub Data into Actionable Insights

The Vertigo Engine is a custom, lightweight, serverless solution designed to unlock the analytical and search potential of your structured data stored directly within GitHub. It provides a robust API layer for complex querying, real-time scoring, and sophisticated data aggregation without relying on external search infrastructure like Elasticsearch or Algolia.

Core Value Proposition

  • Cost Efficiency & Simplicity: Leverage your existing GitHub infrastructure. Eliminate the operational cost, complexity, and latency associated with provisioning and managing external search clusters or databases for read-heavy, document-based data.

  • Deep Customization: Control every aspect of document scoring, filtering, and linguistic analysis with custom Go template functions, rule-based stemming, and a flexible query DSL.

  • Serverless Scalability: Built on AWS Lambda, the system is designed to scale dynamically, handling complex searches and aggregations on demand.
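
As an illustration of the template-driven scoring mentioned under Deep Customization, the sketch below shows one way a Go `text/template` expression could adjust a document's base score. It is a minimal sketch under stated assumptions, not the engine's actual code: the `applyFunctionScore` helper, the `mul` function, and the `.Score` and `.Doc` template fields are hypothetical names chosen for this example, and the function map only borrows the names `log`, `sqrt`, `pow`, and `toFloat64` from this page.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
	"text/template"
)

// toFloat64 converts common JSON-decoded values to float64.
// Hypothetical helper: only the name is taken from this page.
func toFloat64(v any) (float64, error) {
	switch n := v.(type) {
	case float64:
		return n, nil
	case int:
		return float64(n), nil
	case string:
		return strconv.ParseFloat(n, 64)
	default:
		return 0, fmt.Errorf("toFloat64: unsupported type %T", v)
	}
}

// scoreFuncs is an assumed function map; "mul" is added for illustration only.
var scoreFuncs = template.FuncMap{
	"log":       math.Log,
	"sqrt":      math.Sqrt,
	"pow":       math.Pow,
	"toFloat64": toFloat64,
	"mul":       func(a, b float64) float64 { return a * b },
}

// applyFunctionScore renders expr with the base score and document fields,
// then parses the rendered text back into a float64.
func applyFunctionScore(expr string, base float64, doc map[string]any) (float64, error) {
	tmpl, err := template.New("score").Funcs(scoreFuncs).Parse(expr)
	if err != nil {
		return 0, err
	}
	data := struct {
		Score float64
		Doc   map[string]any
	}{Score: base, Doc: doc}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return 0, err
	}
	return strconv.ParseFloat(strings.TrimSpace(sb.String()), 64)
}

func main() {
	// Boost the base score by the square root of a hypothetical "popularity" field.
	doc := map[string]any{"popularity": 100.0}
	expr := `{{ mul .Score (sqrt (toFloat64 .Doc.popularity)) }}`
	score, err := applyFunctionScore(expr, 2.0, doc)
	if err != nil {
		fmt.Println("scoring failed:", err)
		return
	}
	fmt.Println("boosted score:", score)
}
```

Because `text/template` produces text, the sketch parses the rendered output back into a number. A real implementation might cache parsed templates rather than re-parsing the expression for every document.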

Key Features and Differentiators

| Category | Feature | Benefit |
| -- | -- | -- |
| Indexing & Retrieval | GitHub Native Storage | Directly indexes structured JSON files stored in specific repository paths, making your data source your knowledge base. |
|   | Composite/Scoped Search | Allows queries to be tightly scoped by Composite Keys, ensuring lightning-fast initial data retrieval based on index hierarchy. |
| Query Flexibility | Full Boolean Logic (AND/OR/NOT) | Supports complex filtering via nested All, One, None, and Not clauses. |
|   | Advanced Filter Set | Includes Range, Geo-Distance, Exists/Missing, and Template-based Filters for granular document selection. |
|   | Recursive Subqueries | Supports IN/NOT IN operations based on results fetched from a secondary index, enabling powerful relational lookups across your data. |
| Relevance & Scoring | Custom Match Scoring | Relevance is calculated based on tokenized, stemmed, and analyzed text fields, supporting fuzzy (Levenshtein) and exact matching. |
|   | Function Scoring (FunctionScore) | Allows developers to inject custom Go template functions (e.g., log, sqrt, pow, toFloat64) to dynamically modify and boost document scores based on fields like price, recency, or distance. |
|   | Nested Document Scoring | Finds and returns the maximum score from matches within embedded arrays, perfect for complex document structures. |
| Analytics & Reporting | Hierarchical Aggregations | Provides powerful GroupBy, Numeric Range, and Date Histogram bucketing, with support for nested aggregations. |
|   | Comprehensive Metrics | Calculates Sum, Average, Median, Min, Max, Standard Deviation, Percentile, and Cardinality on grouped data. |
|   | Top Hits Projection | For any aggregation bucket, you can retrieve the top documents, complete with sorting, paging, and field projection. |
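
To make the fuzzy (Levenshtein) matching in the table concrete, here is a minimal sketch of edit-distance matching between a query term and an indexed token. It assumes plain Levenshtein distance counted in runes with a caller-supplied edit budget. The `levenshtein` and `fuzzyMatch` names are hypothetical, and the engine's real tokenization, stemming, and thresholds are not shown on this page.

```go
package main

import "fmt"

// minInt returns the smallest of three ints.
func minInt(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b, counted in runes,
// using two rolling rows of the dynamic-programming table.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			// deletion, insertion, substitution
			curr[j] = minInt(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// fuzzyMatch reports whether token is within maxEdits of term.
// Hypothetical helper: the edit budget is an assumption of this sketch.
func fuzzyMatch(term, token string, maxEdits int) bool {
	return levenshtein(term, token) <= maxEdits
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting"))
	fmt.Println(fuzzyMatch("serach", "search", 2))
}
```

Plain Levenshtein counts a transposed pair of letters as two edits, so a typo such as "serach" needs a budget of 2 to match "search". A Damerau-style variant would treat it as one.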