GigaBrain Ultra

Subscription social-listening product for brands and researchers. Basic ($96/mo) and Premier ($200/mo) tiers.

Keyword and semantic search: search for brand and market research that goes beyond exact matching

Custom audiences: pick subreddits to watch as dedicated segments

Real-time monitoring: get email alerts when your brand is mentioned, negative sentiment spikes, or a competitor is stealing the spotlight

Historical charts: understand trends by looking back over time

AI analysis dashboards: mention volume, sentiment breakdown, and common themes at a glance

How it works

From the user's side a monitor is simple: name a brand, product, competitor, or topic, optionally scope it to a set of subreddits, and within seconds Ultra returns how often it is mentioned, how sentiment is trending, the themes people raise, and the threads themselves, then keeps all of that current as new posts and comments arrive.

What set it apart from other social-listening tools was speed and semantics. It returned historical results in seconds where competitors took minutes, and you could listen by meaning rather than only by exact keyword. Instead of just counting mentions, it also ran an LLM over every match to judge it against the specific subject of your monitor. The sections below follow what happens behind the scenes, from the moment you preview or save a monitor, to deliver that.

Defining a monitor

A monitor can be expressed three ways, and the flexibility is part of the product:

Exact keyword, using a small boolean language with AND/OR, phrases, and negation.
Semantic, matching on meaning rather than wording, so it catches paraphrases and related discussion an exact match misses.
Either, plus a semantic-context filter. You can take a broad keyword like Nike and keep only the mentions that are semantically about, say, running shoes, or narrow a semantic monitor further. This layering is what lets a monitor be both precise and broad at once.

An audience scopes any of these to chosen subreddits, or to all of Reddit.

Matching against the indexes

Previewing or saving a monitor searches roughly the last 30 days of Reddit for matches. Doing that in seconds for an arbitrary query is the central problem, and the reason it works is that matching never scans raw text. Recent Reddit content sits in Spanner behind two indexes built ahead of time: a full-text index over tokenized titles, selftext, and comment bodies, and an approximate-nearest-neighbour index over embeddings (Vertex AI text-embedding-005, 128 dimensions). A keyword monitor runs a full-text search; a semantic monitor embeds the query and does an ANN search, keeping anything within a cosine-distance threshold; a semantic-context filter stacks on as a second distance test.
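As a rough illustration of the two distance tests, the sketch below applies a cosine-distance threshold to the query and then, optionally, a second one to a semantic context. It runs in plain Python rather than inside an ANN index, and the function names and both threshold values are assumptions made for the example, not values from the product.

```python
import math
from typing import Optional, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity: 0 for the same direction, 1 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def semantic_match(
    item_vec: Sequence[float],
    query_vec: Sequence[float],
    context_vec: Optional[Sequence[float]] = None,
    max_distance: float = 0.35,          # assumed threshold, for illustration only
    max_context_distance: float = 0.45,  # assumed threshold, for illustration only
) -> bool:
    """First distance test against the query, optional second test against the context."""
    if cosine_distance(item_vec, query_vec) > max_distance:
        return False
    if context_vec is not None:
        return cosine_distance(item_vec, context_vec) <= max_context_distance
    return True
```

For a keyword monitor with a semantic-context filter, the first test would be the full-text match instead, with only the context distance test applied to the embedding.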

This is not a search that returns a ranked page of ten. A monitor can have millions of matches over a month, and the product is built around interacting with all of them: filtering by subreddit, date, score, sentiment, or topic, sorting, paging through the threads, and reading charts that aggregate the whole set. So matching can't stop at the best few. It has to retrieve every match and persist it into a store the user can then slice and aggregate, which makes writing the results as much of a problem as finding them.

Querying the month in parallel

Rather than run one query over the whole month, the backfill breaks the window into day-sized slices and queries each independently, for two reasons. The obvious one is parallelism: a single month-long query is read out by one worker, whereas thirty day slices can be spread across workers for roughly thirty times the throughput. The subtler one is sampling correctness. When a query can return more matches than its result cap, which ones come back depends on how the index happens to return rows, not on real volume, so a single capped query over the month would skew the day-to-day trend toward whichever days the storage favoured. Slicing by day makes each day's sample independent, so the time series reflects actual activity rather than storage order.

Slicing only pays off if each slice is cheap to isolate, though. Restricting a query to one day naively means filtering by timestamp, and neither a full-text nor an ANN index filters naturally by timestamp: bolting a timestamp predicate on pulls the query off the index, so each of the thirty slices would scan far more than its day and the parallelism win would invert into thirty times the compute. The way around it is that Reddit ids are monotonic with time, so a date range is just a contiguous range of integer ids, and those ids are already carried in both indexes. A slice can therefore be expressed purely as an id range: no timestamp predicate, no join, no secondary lookup, the index does the narrowing itself. The same property that keeps each day's statistics honest is what makes the thirty-way split genuinely cheaper instead of thirty times more expensive.
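A minimal sketch of the id-range idea, assuming ids increase with time as described above. It assumes a small lookup of the first id seen on each day (the hypothetical `day_start_ids`) and shows how thirty day slices become thirty half-open integer ranges. How the real system finds those boundaries is not stated in the text.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


def decode_id(base36_id: str) -> int:
    """Reddit ids are written in base 36; decode one to the underlying integer."""
    return int(base36_id, 36)


def day_slices(
    day_start_ids: Dict[date, int],  # hypothetical lookup: first id seen on each day
    newest: date,
    days: int,
) -> List[Tuple[date, int, int]]:
    """Return (day, lo_id, hi_id) half-open id ranges, newest day first.

    day_start_ids must also hold an entry for the day after `newest`,
    which serves as the upper bound of the newest slice.
    """
    slices = []
    for offset in range(days):
        day = newest - timedelta(days=offset)
        lo = day_start_ids[day]
        hi = day_start_ids[day + timedelta(days=1)]
        slices.append((day, lo, hi))
    return slices
```

Each tuple can then be turned into an index query of the form `lo_id <= id < hi_id`, with no timestamp predicate involved.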

Streaming results as they arrive

Each day's posts and comments are queried in parallel, streaming the matches into a batched writer that commits them to Spanner continuously rather than collecting everything for one write at the end, so persistence keeps pace with retrieval. Recent slices are prioritised, so the last week is stored first and the dashboard populates within seconds while the older weeks keep filling in behind it. On the client this runs over a streaming websocket connection that pushes partial results as they land, so time-to-interactivity is well ahead of time-to-completion: the user is reading and filtering the most recent matches while the rest of the month is still being retrieved.
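The shape of that pipeline can be sketched with a thread pool and a small batching writer. This is an illustration only: `commit` and `query_slice` are hypothetical callables standing in for the Spanner write and the per-slice index query, and the batch size and worker count are arbitrary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Sequence


class BatchedWriter:
    """Buffers rows and commits them in fixed-size batches as they arrive."""

    def __init__(self, commit: Callable[[List[dict]], None], batch_size: int = 500):
        self._commit = commit  # hypothetical: writes one batch to the store
        self._batch_size = batch_size
        self._buffer: List[dict] = []
        self._lock = threading.Lock()

    def add(self, row: dict) -> None:
        with self._lock:
            self._buffer.append(row)
            if len(self._buffer) < self._batch_size:
                return
            batch, self._buffer = self._buffer, []
        self._commit(batch)  # commit outside the lock so other producers keep going

    def flush(self) -> None:
        with self._lock:
            batch, self._buffer = self._buffer, []
        if batch:
            self._commit(batch)


def run_backfill(
    slices_newest_first: Sequence[object],
    query_slice: Callable[[object], Iterable[dict]],  # hypothetical per-slice query
    writer: BatchedWriter,
    workers: int = 8,
) -> None:
    def drain(day_slice: object) -> None:
        for row in query_slice(day_slice):
            writer.add(row)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks are picked up in submission order, so recent slices start first.
        # list() waits for completion and re-raises any worker exception.
        list(pool.map(drain, slices_newest_first))
    writer.flush()
```

Because the writer commits whenever a batch fills, the first rows reach the store while later slices are still being queried, which is the property the dashboard relies on.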

Bounding large monitors with sampling

Persisting and analysing every match is what makes the results interactive, but some monitors match far more than is useful, or even safe, to run. A keyword as common as reddit matches somewhere between tens and hundreds of millions of comments a month; returning all of them is pointless when the trends and themes are clear from a small fraction, and the query itself would overwhelm the database. A stopword list doesn't solve this, because there is no fixed set of "too common" terms: the real volume falls out of how a keyword, an audience, and a time range combine, so whether a given monitor is dangerous can't be judged from the query text alone.

The catch is that we have to decide whether to sample before running the real query, not after, or we are back to executing the very query we were trying to avoid. Even counting the matches is unsafe for a term like reddit, so the estimate is itself sampled: the backfill counts matching comments over the last 24 hours through the deterministic hashed-id filter described in the next paragraph, fixed at 5%, scales the count back up (dividing by 0.05) to estimate the day's volume, and compares it to a target of 500 comments a day. Under the target the monitor runs unsampled; over it, the rate is set to the target divided by the estimated volume, then applied to the full 30-day query for both posts and comments, so the real run never materialises the full set.
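The arithmetic of that decision is small enough to write out. The sketch below uses the 5% probe rate and the 500-a-day target from the text; the function name is made up for the example, and treating a volume exactly at the target as unsampled is an assumption.

```python
def choose_sample_rate(
    sampled_count_24h: int,
    probe_rate: float = 0.05,
    target_per_day: int = 500,
) -> float:
    """Pick the sampling rate for the full backfill from a 5%-sampled 24-hour count."""
    estimated_daily = sampled_count_24h / probe_rate  # scale the probe back up
    if estimated_daily <= target_per_day:
        return 1.0  # small monitor: run unsampled
    return target_per_day / estimated_daily  # divide the volume down to the target
```

For example, a probe that counts 250,000 comments implies about 5,000,000 a day, so the rate comes out at roughly 0.0001. A probe that counts 10 implies about 200 a day, and the monitor runs unsampled.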

The sampling reuses the same idea as the date-range trick. It is a deterministic hash on the integer id evaluated inside the index query, so the index still does the work and no extra lookup is needed. Because it is deterministic and keyed on the id, the same items are always selected and any other process can reproduce the exact same selection with no coordination. The chosen rate is stored on the monitor, and the dashboards divide their counts back up by it, so the user sees estimated true volumes while only a bounded, representative sample is ever stored or analysed.
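A minimal sketch of deterministic id sampling and the matching scale-up. It uses BLAKE2b from Python's standard library purely as a stand-in hash (the system described here uses FarmHash inside the index query), and the bucket count and function names are assumptions for the example.

```python
import hashlib

_BUCKETS = 1_000_000  # assumed resolution of the sampling rate


def in_sample(content_id: int, rate: float) -> bool:
    """Keep an id if its hash bucket falls below the rate; same id, same answer, everywhere."""
    digest = hashlib.blake2b(content_id.to_bytes(8, "big"), digest_size=8).digest()
    bucket = int.from_bytes(digest, "big") % _BUCKETS
    return bucket < int(rate * _BUCKETS)


def estimated_true_count(stored_count: int, rate: float) -> float:
    """Scale a count of stored (sampled) matches back up to an estimated true volume."""
    return stored_count / rate
```

Because the decision depends only on the id and the rate, the backfill and the live matcher can each evaluate it independently and still agree on every item. With this bucket scheme the samples are also nested: anything kept at a lower rate is kept at every higher rate.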

Keeping it live

The backfill only covers the past. The moment a monitor is saved it has to keep matching new posts and comments as they are published, and those live results have to line up with the backfilled ones or the trend chart would jump at the seam.

Matching runs as a pool of replicas that share the live reddit-submissions and reddit-comments firehoses, with the streams divided across replicas so throughput scales by adding more of them. What keeps it affordable is that the cost that grows is the content volume, not the number of monitors. The expensive step, embedding an item, happens once per post or comment no matter how many monitors exist, and is cached so repeated text is free. Every active monitor is then tested against that one embedding. Each replica keeps all active monitors in memory, refreshed whenever one changes, so a test is a cheap in-process comparison rather than a query, and adding a monitor adds a comparison rather than a search.

Each test is the same computation the backfill runs, in the same order: the FarmHash sampling on the content id, the audience check, then the matcher. The matcher is either a boolean keyword evaluation through a small custom query interpreter (RQuery) or, for semantic monitors, a cosine-distance comparison against the precomputed query vector. Because the sampling and the matching are identical to the backfill's, an item the backfill would have dropped is dropped live too, and matches flow into the same ListenerMatches table and ultra-matches queue, indistinguishable from backfill output. That is what lets history and live form one continuous series.
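The per-item loop can be sketched as follows. Everything here is a simplified stand-in: BLAKE2b replaces FarmHash, a substring test replaces RQuery, and `embed` is a toy two-dimensional function in place of a real embedding call. The `Monitor` and `Item` shapes are assumptions. The point is the order of the checks and that the embedding is computed at most once per item, however many monitors are tested.

```python
import hashlib
import math
from dataclasses import dataclass
from functools import lru_cache
from typing import FrozenSet, List, Optional, Tuple

Vector = Tuple[float, ...]


@dataclass(frozen=True)
class Monitor:  # hypothetical in-memory shape of an active monitor
    monitor_id: str
    sample_rate: float = 1.0
    subreddits: Optional[FrozenSet[str]] = None  # None means all of Reddit
    keyword: Optional[str] = None                # stand-in for an RQuery expression
    query_vec: Optional[Vector] = None           # precomputed for semantic monitors
    max_distance: float = 0.35                   # assumed threshold


@dataclass(frozen=True)
class Item:
    content_id: int
    subreddit: str
    text: str


def in_sample(content_id: int, rate: float) -> bool:
    digest = hashlib.blake2b(content_id.to_bytes(8, "big"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % 1_000_000 < int(rate * 1_000_000)


@lru_cache(maxsize=100_000)
def embed(text: str) -> Vector:
    """Toy embedder for the example only; cached so repeated text is free."""
    lowered = text.lower()
    return (float(lowered.count("run")), float(lowered.count("cook")))


def cosine_distance(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if norms == 0.0 else 1.0 - dot / norms


def matching_monitors(item: Item, monitors: List[Monitor]) -> List[str]:
    item_vec: Optional[Vector] = None
    hits = []
    for m in monitors:
        if not in_sample(item.content_id, m.sample_rate):  # 1. sampling
            continue
        if m.subreddits is not None and item.subreddit not in m.subreddits:  # 2. audience
            continue
        if m.keyword is not None:  # 3a. keyword matcher
            matched = m.keyword.lower() in item.text.lower()
        elif m.query_vec is not None:  # 3b. semantic matcher
            if item_vec is None:
                item_vec = embed(item.text)  # at most one embedding per item
            matched = cosine_distance(item_vec, m.query_vec) <= m.max_distance
        else:
            matched = False
        if matched:
            hits.append(m.monitor_id)
    return hits
```

Adding a monitor adds one more pass through the loop body, a comparison rather than a search, which is why the cost tracks content volume rather than monitor count.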

Analysing every match

Once a match is found, an LLM labels it. The label is relative to the monitor's subject, not the content by itself: the same comment can be positive for one monitor and negative for another, so it is scored against the keywords and semantic context that produced the match. A worker reads the full text from Bigtable, including the parent post when the match is a comment, pairs it with the subject, and sends batches of about ten to Gemini 2.5 Flash Lite at temperature zero with a fixed JSON schema. The model returns a sentiment (positive, negative, or neutral), whether the content is really about the subject, and a list of topics.
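The batching and the shape of the labels can be sketched without calling a model. The field names below (`id`, `sentiment`, `relevant`, `topics`) are assumptions for illustration; the text only says the model returns a sentiment, an on-subject judgement, and a list of topics under a fixed JSON schema.

```python
import json
from typing import Dict, Iterator, List, Sequence

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def batches(matches: Sequence[dict], size: int = 10) -> Iterator[List[dict]]:
    """Split matches into groups of about ten for one model request each."""
    for start in range(0, len(matches), size):
        yield list(matches[start:start + size])


def parse_labels(raw_json: str, expected_ids: Sequence[str]) -> Dict[str, dict]:
    """Validate one response (assumed to be a JSON list) and key the labels by match id."""
    expected = set(expected_ids)
    labels: Dict[str, dict] = {}
    for entry in json.loads(raw_json):
        if entry.get("id") not in expected:
            continue  # ignore labels for ids that were not in the batch
        if entry.get("sentiment") not in ALLOWED_SENTIMENTS:
            continue  # ignore anything outside the three allowed values
        labels[entry["id"]] = {
            "sentiment": entry["sentiment"],
            "relevant": bool(entry.get("relevant", False)),
            "topics": [str(t) for t in entry.get("topics", [])],
        }
    return labels
```

Matches whose labels fail validation simply stay unlabelled and can be retried, which fits a design where analysis trails matching.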

The LLM step is much slower and more expensive than matching, so it is kept off the matching path. A match is written and shows up in the counts, threads, and source breakdowns right away, and its sentiment and topics are filled in afterwards by workers reading the shared ultra-matches queue. The dashboard tracks how much of a monitor has been analysed so far and fills the sentiment and theme views in as results land, instead of waiting for the whole set.

A new monitor can drop a month of matches onto the queue at once, so analysis is ordered by recency. Matches from the last seven days go on a high-priority queue and the rest on a low-priority one, and workers always clear the high-priority queue first. With the newest-first retrieval, that means the recent window most users look at is matched and analysed before the deeper history. Matches belonging to a monitor that has since been re-run or cancelled are dropped instead of analysed, so superseded versions don't waste model calls.
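A minimal in-memory sketch of that ordering, under stated assumptions: two deques stand in for the real queues, and each match carries hypothetical `monitor_id`, `version`, and `created_at` fields so that superseded work can be recognised and dropped.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Optional

HIGH_PRIORITY_WINDOW = timedelta(days=7)


class AnalysisQueues:
    """Two queues: matches from the last seven days, and everything older."""

    def __init__(self) -> None:
        self.high: Deque[dict] = deque()
        self.low: Deque[dict] = deque()

    def enqueue(self, match: dict, now: datetime) -> None:
        age = now - match["created_at"]
        queue = self.high if age <= HIGH_PRIORITY_WINDOW else self.low
        queue.append(match)

    def next_match(self, current_versions: Dict[str, int]) -> Optional[dict]:
        """Return the next match to analyse, always draining the high queue first.

        current_versions maps monitor id to its live version; a match whose
        monitor is missing (cancelled) or has moved on (re-run) is dropped.
        """
        for queue in (self.high, self.low):
            while queue:
                match = queue.popleft()
                if current_versions.get(match["monitor_id"]) == match["version"]:
                    return match
        return None
```

Checking the version at dequeue time rather than at enqueue time means a re-run invalidates everything already waiting, without having to purge the queues.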

In summary

Ultra was the product of many large performance wins stacked on top of each other. Together they helped to deliver the fastest, most powerful social-listening tool available. It became a foundational technology powering several products at GigaBrain, including GigaBrain Shopping's product intelligence.