Open-Source · Self-Hosted · MIT License

Stop reading. Start acting.

Social Inference Engine monitors 13 platforms, classifies every post into 10 business-intent signal types using calibrated LLM inference, and delivers a prioritised action queue — running entirely on your machine.

593 tests passing · MIT License · Python 3.11 · FastAPI · pgvector
Signal Queue — Live (streaming)
churn_risk · URGENT · Reddit · 2m ago · 0.87
competitor_weakness · Twitter/X · 8m ago · 0.91
lead_opportunity · LinkedIn · 15m ago · 0.79
support_escalation · URGENT · Twitter/X · 23m ago · 0.94
feature_request_pattern · Reddit · 31m ago · 0.72
5 signals · updated 2s ago · SSE active

Signal queue view — confidence-ranked, with verbatim evidence

13 platform connectors · 10 business signal types · 593 tests passing · < 13 µs per deduplication check · 100% local (no data egress)

Your team is drowning in social noise. The signals that matter are buried.

B2B teams responsible for reputation, sales, and product spend hours each week manually scrolling Reddit threads, YouTube comment sections, and news feeds — searching for the posts that contain an actual signal: a prospect who just complained about a competitor, a customer whose support issue went public, a feature request pattern forming across dozens of independent posts.

Existing social listening tools surface volume, not intent. They count mentions and track sentiment. They do not tell you what to do.

Hours lost to manual triage

A mid-size B2B team spends 3–5 hours per week reading posts that contain no actionable signal — only to miss the churn risk that surfaced in a subreddit no one was watching.

Signal buried in sentiment scores

A post can be negative in sentiment and completely irrelevant to your business. It can be neutral in tone and contain an explicit statement of intent to switch vendors. Sentiment is the wrong unit of measurement.

SaaS tools with your data

Every SaaS listening tool requires you to pipe your customer conversations, competitor mentions, and product feedback through a third-party server. For regulated industries, this is a compliance problem. For everyone else, it is an unnecessary risk.

From raw post to structured action in six steps

Every observation that enters the system passes through a deterministic, auditable pipeline. No black boxes. Every classification decision comes with a structured rationale and verbatim evidence spans.

1

Ingest

Celery workers fetch content from every configured platform connector on a 15-minute schedule. New URLs are checked against a Bloom filter before any processing begins — eliminating duplicate work without a database query.

Bloom filter · O(1) deduplication · 13 µs per URL check
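The dedup check above can be sketched as a minimal Bloom filter. This is an illustrative implementation, not the project's actual `BloomFilter` class; the double-hashing scheme and the sizing formulas are standard textbook choices assumed here.

```python
import hashlib
import math

class BloomFilter:
    """Minimal Bloom filter sketch for URL deduplication (illustrative)."""

    def __init__(self, capacity: int, error_rate: float = 0.01):
        # Standard sizing: m bits and k hash functions for a target FP rate.
        self.m = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit indices from one SHA-256 digest.
        h = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(h[:8], "big")
        h2 = int.from_bytes(h[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        # A miss is guaranteed correct; a hit is correct with ~99% probability.
        return all(self.bits[p // 8] >> (p % 8) & 1 for p in self._positions(item))

bf = BloomFilter(capacity=100_000)
bf.add("https://reddit.com/r/saas/abc123")
print("https://reddit.com/r/saas/abc123" in bf)  # True
```

Because a Bloom filter has no false negatives, a "not seen" answer lets the worker skip the database entirely; only the rare false-positive "seen" costs a wasted skip.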
2

Sample

When a platform returns more content than the fetch budget allows, a reservoir sampler draws a statistically unbiased sample. Every item in the stream has an equal probability of inclusion, regardless of total stream length.

Reservoir sampling · O(n) · ~1,000 items/ms throughput
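The sampler described above is classic reservoir sampling (Algorithm R). A minimal sketch, independent of the project's implementation:

```python
import random
from typing import Optional

def reservoir_sample(stream, k: int, rng: Optional[random.Random] = None):
    """Keep a uniform random sample of k items from a stream of unknown
    length, in O(n) time and O(k) memory."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            # Item i survives with probability k / (i + 1), which keeps every
            # item's overall inclusion probability at exactly k / n.
            j = rng.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

sample = reservoir_sample(range(10_000), k=50, rng=random.Random(42))
print(len(sample))  # 50
```

The key property is that the stream length never needs to be known in advance, so a connector can sample while it fetches.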
3

Normalise

Raw observations from 13 different platforms — each with its own schema, encoding, language, and media type — are transformed into a unified NormalizedObservation. PII is scrubbed. Non-English text is detected and flagged for translation.

NormalizationEngine · DataResidencyGuard · spaCy NER
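A unified record like `NormalizedObservation` might look roughly like the sketch below. The field names and types here are hypothetical, not the project's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class NormalizedObservation:
    """Hypothetical shape of the unified record every connector emits."""
    platform: str                  # e.g. "reddit", "youtube"
    url: str
    text: str                      # PII-scrubbed body text
    author_pseudonym: str          # SHA-256 pseudonym, never the raw handle
    language: str = "en"
    needs_translation: bool = False
    fetched_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

obs = NormalizedObservation(
    platform="reddit",
    url="https://reddit.com/r/saas/abc123",
    text="Looking for alternatives to [Competitor]...",
    author_pseudonym="3f7a19c2e4b0d8aa",
)
print(obs.platform, obs.language)  # reddit en
```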
4

Retrieve

A candidate retrieval step finds semantically similar past observations using pgvector cosine similarity search. The top-k results are assembled into a few-shot context window passed to the LLM.

pgvector · 1536-dim embeddings · HNSW approximate nearest neighbour
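What pgvector's cosine-distance operator (`<=>`) computes, and how the top-k query uses it, can be shown in pure Python. This is a sketch of the math only; in production the ordering and the HNSW index live inside PostgreSQL:

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator returns cosine distance: 1 - cos(a, b).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def top_k(query, candidates, k=3):
    # Equivalent in spirit to:
    #   SELECT id FROM observations ORDER BY embedding <=> :query LIMIT :k
    return sorted(candidates, key=lambda c: cosine_distance(query, c["embedding"]))[:k]

past = [
    {"id": 1, "embedding": [1.0, 0.0]},
    {"id": 2, "embedding": [0.9, 0.1]},
    {"id": 3, "embedding": [0.0, 1.0]},
]
print([c["id"] for c in top_k([1.0, 0.05], past, k=2)])  # [1, 2]
```

The returned neighbours become the few-shot examples in the classification prompt.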
5

Classify

The LLM Adjudicator classifies the observation against the 10-type signal taxonomy. Frontier-tier signals route to GPT-4o. The remaining 7 types route to a fine-tuned smaller model or a local Ollama model — reducing per-signal LLM cost without accuracy regression.

LLMRouter · two-tier routing · calibrated confidence · abstention
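Because the routing is deterministic on signal type, the core of the router reduces to a set lookup. A sketch, with the model identifiers standing in for whatever tiers are configured:

```python
# The three frontier-tier types named in the routing policy.
FRONTIER_TYPES = {"churn_risk", "misinformation_risk", "support_escalation"}

def route(signal_type: str) -> str:
    """Deterministic two-tier routing sketch: no sampling, no probabilistic
    routing. Model names here are illustrative placeholders."""
    return "gpt-4o" if signal_type in FRONTIER_TYPES else "gpt-4o-mini-finetuned"

print(route("churn_risk"))        # gpt-4o
print(route("lead_opportunity"))  # gpt-4o-mini-finetuned
```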
6

Rank and Deliver

Classified signals are scored across three dimensions: opportunity, urgency, and risk. The ActionRanker produces a composite priority score. The ranked queue is available via REST API, and new signals are pushed via Server-Sent Events.

ActionRanker · composite priority score · SSE streaming
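A composite over opportunity, urgency, and risk could be as simple as a weighted sum. The ActionRanker's actual formula and weights are not documented here, so the weights below are placeholders:

```python
def priority_score(opportunity: float, urgency: float, risk: float,
                   weights=(0.3, 0.4, 0.3)) -> float:
    """Illustrative composite priority: a weighted sum of the three
    dimensions. The real ActionRanker's formula may differ."""
    w_o, w_u, w_r = weights
    return w_o * opportunity + w_u * urgency + w_r * risk

queue = [
    {"type": "churn_risk", "score": priority_score(0.4, 0.95, 0.9)},
    {"type": "trend_to_content", "score": priority_score(0.7, 0.2, 0.1)},
]
queue.sort(key=lambda s: s["score"], reverse=True)
print([s["type"] for s in queue])  # ['churn_risk', 'trend_to_content']
```

Whatever the exact weighting, the queue is just the classified signals sorted by this score.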

10 business-intent signal types. Not 10,000 sentiment buckets.

Every classified observation maps to exactly one signal type from the taxonomy below. The classifier abstains — and tells you why — when the evidence is insufficient.

REVENUE OPPORTUNITIES

lead_opportunity (Lead Opportunity)

A prospect publicly expressing dissatisfaction with a competitor, requesting alternatives, or describing a pain point your product solves.

Action: DM Outreach
"We've been using [Competitor] for two years and the pricing just got unbearable. Looking for alternatives in the comments."

competitor_weakness (Competitor Weakness)

Public criticism, outage reports, or recurring complaints directed at a competitor that represent a window to position your product.

Action: Create Content
"[Competitor] has been down for 3 hours. This is the fourth time this quarter. I'm done."

influencer_amplification (Influencer Amplification)

A post by a high-reach account that mentions your brand, category, or a topic you can credibly enter. Time-sensitive.

Action: Reply Public
[YouTube creator with 420k subscribers] "I switched my entire agency workflow to this tool — here's why."
RISK SIGNALS

churn_risk (Churn Risk)

An existing customer or user expressing frustration, threatening to cancel, or comparing your product unfavourably. Routes to the frontier LLM tier.

Action: Internal Alert → DM Outreach
"Three bugs in two weeks and support hasn't replied. I'm moving our team off [Product] this Friday unless something changes."

misinformation_risk (Misinformation Risk)

Factually incorrect claims about your product or company that are spreading in public forums. Each hour of delay increases the amplification.

Action: Reply Public
"[Product] was acquired by [Wrong Company] last month and they're shutting it down." — spreading in a 15k-member Slack community.

support_escalation (Support Escalation)

A support issue that has escaped private channels and is now playing out publicly on Twitter/X, Reddit, or a tech forum.

Action: Reply Public + Internal Alert
"@[Product] — I've opened three tickets in 10 days and nobody has responded. Posting here since I have no other options."
PRODUCT SIGNALS

product_confusion (Product Confusion)

Posts that reveal a fundamental misunderstanding of what your product does, how it works, or how it is priced.

Action: Create Content
"Wait — [Product] doesn't support [Feature]? I thought that was the whole point. We bought it specifically for that."

feature_request_pattern (Feature Request Pattern)

A recurring request for a specific capability appearing across multiple independent posts over a rolling window.

Action: Monitor → Internal Alert
Cluster of 14 posts over 3 weeks across Reddit and Twitter all requesting native CSV export with custom date ranges.

launch_moment (Launch Moment)

A product launch — yours or a competitor's — generating significant public discussion. Includes pre-launch leaks and post-launch reactions.

Action: Create Content + Reply Public
"[Competitor] just launched [Feature] in beta. This is what everyone in our space has been waiting for."
CONTENT OPPORTUNITIES

trend_to_content (Trend to Content)

A rising conversation, topic, or question in your market that your team is credibly positioned to address with content.

Action: Create Content
Rapid growth in discussions about [Technical Topic] across Hacker News and multiple engineering subreddits over the past 72 hours.

When confidence is insufficient for a reliable classification, the model abstains and returns a structured abstention reason (e.g., ambiguous_intent, insufficient_context, out_of_scope). Abstentions never surface in the signal queue.
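The abstention contract could be represented with a result object along these lines. The field names and the `actionable` property are hypothetical; only the abstention reasons come from the description above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ClassificationResult:
    """Sketch of a classification result that can carry an abstention."""
    signal_type: Optional[str]       # None when the model abstains
    confidence: float
    abstention_reason: Optional[str] # e.g. "ambiguous_intent",
                                     # "insufficient_context", "out_of_scope"

    @property
    def actionable(self) -> bool:
        # Abstentions never surface in the signal queue.
        return self.signal_type is not None

result = ClassificationResult(None, 0.41, "insufficient_context")
print(result.actionable)  # False
```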

13 connectors. Social, news, and community — all in one queue.

Every platform connector implements the same interface. Adding a new source takes one API credential and one configuration object. No custom ETL pipeline required.

Reddit · Social · Posts, comments, subreddit streams · OAuth 2.0
YouTube · Social · Video comments, channel posts · Google API key
TikTok · Social · Video comments, creator posts · TikTok Developer App
Facebook · Social · Page posts, public groups · Meta Developer token
Instagram · Social · Post captions, comments · Meta Developer token
WeChat · Social · Official account articles · WeChat Open Platform
RSS · Generic · Any RSS 2.0 or Atom feed · None — feed URLs
New York Times · News · All section feeds · None — public RSS
Wall Street Journal · News · All section feeds · None — public RSS
ABC News (US) · News · Top stories, tech, business · None — public
ABC News Australia · News · Top stories, technology · None — public
Google News · News · Top stories by topic · None — scrape
Apple News · News · Top stories by topic · None — scrape

Need a platform that isn't listed?

The connector interface is documented. Adding a new connector requires implementing one abstract class with three methods: authenticate(), fetch(), and validate_credentials().
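The three methods named above can be sketched as an abstract base class. The method names come from the connector guide; the signatures and return types here are assumptions:

```python
from abc import ABC, abstractmethod

class PlatformConnector(ABC):
    """Sketch of the connector interface — check the documented guide
    for the exact signatures."""

    @abstractmethod
    def authenticate(self) -> None:
        """Acquire or refresh the API credential for this platform."""

    @abstractmethod
    def fetch(self) -> list:
        """Return raw observations published since the last fetch."""

    @abstractmethod
    def validate_credentials(self) -> bool:
        """Cheap check that the configured credential is still valid."""

class DummyConnector(PlatformConnector):
    def authenticate(self) -> None: pass
    def fetch(self) -> list: return [{"url": "https://example.com/post/1"}]
    def validate_credentials(self) -> bool: return True

print(DummyConnector().fetch()[0]["url"])  # https://example.com/post/1
```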

Read the connector guide
SHA-256 pseudonym ✓ · PII scrubbed ✓ · Audit log ✓

Your data never leaves your machine.

Social Inference Engine enforces a zero-egress contract at the application boundary. The DataResidencyGuard intercepts every LLM call and verifies that no raw personal data is present in the prompt before it is dispatched — and writes an immutable audit log entry for every redaction it makes.

Author pseudonymisation

Author handles and user IDs are replaced with deterministic SHA-256 pseudonyms before any text is assembled into an LLM prompt. The mapping is stored only in your local database.
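Deterministic pseudonymisation amounts to hashing the handle. A sketch; whether the real implementation salts the hash, and how, is an assumption here (the salt below is a placeholder):

```python
import hashlib

def pseudonymize(handle: str, salt: bytes = b"per-install-secret") -> str:
    """Deterministic SHA-256 pseudonym sketch. A per-install salt
    (placeholder value here) hinders rainbow-table reversal of
    common handles."""
    return hashlib.sha256(salt + handle.encode()).hexdigest()[:16]

a = pseudonymize("u/alice_dev")
print(a == pseudonymize("u/alice_dev"))  # True — stable across runs
print(a == pseudonymize("u/bob"))        # False — distinct per author
```

Determinism matters: the same author always maps to the same pseudonym, so patterns across an author's posts remain visible without exposing the handle.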

PII scrubbing at the call boundary

Email addresses, phone numbers, and identifying URL parameters are removed from observation text before prompt assembly. A secondary verify_clean() check runs immediately before the API call.
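The flavour of that scrubbing step, reduced to two regexes. Real PII scrubbing (including the spaCy NER pass for names) is considerably more involved; this is illustrative only:

```python
import re

# Illustrative patterns, not the project's actual rules.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(scrub("Reach me at jane@example.com or +1 (555) 123-4567."))
# Reach me at [EMAIL] or [PHONE].
```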

Immutable audit log

Every redaction generates a structured audit log entry with the redaction type, a hash of the original value, and a timestamp. The log is append-only and stored in your local PostgreSQL instance.

Full offline operation with Ollama

Configure LOCAL_LLM_URL=http://localhost:11434 and LOCAL_LLM_MODEL=llama3.1:8b to route all classification inference to a local Ollama instance. No observation text ever reaches an external network.

“Social Inference Engine is designed for deployment in environments where cloud AI providers are prohibited by compliance policy. The privacy architecture was not retrofitted — it is structural.”

Two-tier inference. Frontier accuracy where it matters. Fine-tuned efficiency everywhere else.

Social Inference Engine routes each observation to one of two LLM tiers based on the signal type being evaluated. The routing decision is deterministic — no sampling, no probabilistic routing.

Tier 1 — Frontier

Models: GPT-4o · Claude 3.5 Sonnet
Signal types: churn_risk · misinformation_risk · support_escalation

These three types carry the highest cost of a false negative. Frontier accuracy is non-negotiable.

Latency: 1.5–4 s per signal

Tier 2 — Non-Frontier

Models: GPT-4o mini (fine-tuned) · Ollama llama3.1:8b (local)
Signal types: lead_opportunity · competitor_weakness · influencer_amplification · product_confusion · feature_request_pattern · launch_moment · trend_to_content

The seven types with a lower cost of error respond well to fine-tuning: 70–80% per-signal cost reduction with no measurable accuracy regression.

Latency: 0.4–1.2 s (fine-tuned) · 3–12 s (local)

85–90% cost reduction on non-frontier volume

At 1,000 signals per day, routing 70% to Tier 2 at GPT-4o mini pricing versus routing all to GPT-4o represents a cost reduction of approximately 85–90%. The exact saving depends on your provider pricing at the time of deployment.
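The arithmetic behind that range is easy to check. The price ratio below is an assumption for illustration, not quoted provider pricing — plug in current per-token prices, which change frequently:

```python
def tier2_saving(price_frontier: float, price_tier2: float) -> float:
    """Cost reduction on the volume routed to Tier 2, as a fraction of
    what that same volume would cost on the frontier model."""
    return 1.0 - price_tier2 / price_frontier

# If the Tier-2 model costs roughly 10-15% of the frontier model per
# token (an assumption), the saving on routed volume lands at 85-90%.
print(f"{tier2_saving(1.00, 0.12):.0%}")  # 88%
```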

Provider Configuration

Provider · Config Key · Notes
OpenAI GPT-4o · OPENAI_API_KEY · Frontier tier (required)
OpenAI fine-tune · FINE_TUNED_MODEL_ID · Non-frontier tier (recommended)
Anthropic Claude · ANTHROPIC_API_KEY · Alternative frontier tier
Ollama (local) · LOCAL_LLM_URL + LOCAL_LLM_MODEL · Zero-cost, zero-egress
vLLM (self-host) · VLLM_ENDPOINT · High-throughput self-hosted

The system gets more accurate the more you use it.

LLMs are systematically miscalibrated: they tend toward overconfidence on common signal types and underconfidence on rare ones. A model that outputs confidence = 0.91 for every churn_risk does not have 91% accuracy.

Social Inference Engine applies per-signal-type temperature scaling, calibrated on the 107-example seed dataset and updated online after every analyst feedback event. A confidence score of 0.87 means: in the training distribution, 87% of signals classified at this confidence level were correct.

Temperature scalars are updated in-process in 6–8 microseconds per feedback event. There is no retraining cycle. The first correction improves subsequent classifications immediately.
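An online per-type temperature update can be sketched as one gradient step on the negative log-likelihood of the calibrated confidence. The scaling form (confidence rescaled in logit space, so values below 1.0 dampen confidence) and the learning rate are assumptions; the project's exact update rule may differ:

```python
import math

def _logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class ConfidenceCalibrator:
    """Sketch of per-signal-type online temperature scaling."""

    def __init__(self, lr: float = 0.1):
        self.lr = lr
        self.temps = {}  # absent key means 1.0, i.e. uncalibrated

    def calibrate(self, signal_type: str, raw_conf: float) -> float:
        t = self.temps.get(signal_type, 1.0)
        return _sigmoid(t * _logit(raw_conf))

    def feedback(self, signal_type: str, raw_conf: float, correct: bool) -> None:
        # One gradient step on the NLL of the calibrated confidence
        # against the analyst's correction: d(NLL)/dT = (p - y) * z.
        t = self.temps.get(signal_type, 1.0)
        z = _logit(raw_conf)
        p = _sigmoid(t * z)
        self.temps[signal_type] = t - self.lr * (p - float(correct)) * z

cal = ConfidenceCalibrator()
before = cal.calibrate("churn_risk", 0.91)
cal.feedback("churn_risk", 0.91, correct=False)  # analyst marked it wrong
after = cal.calibrate("churn_risk", 0.91)
print(after < before)  # True: confidence dampened after one correction
```

No retraining, no restart: the temperature for that one signal type moves immediately, and every subsequent classification of that type uses the new scalar.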

Fine-tuning targets (non-frontier tier)

Macro F1: ≥ 0.82
Expected Calibration Error: ≤ 0.05
False-action rate: ≤ 0.08
Abstention rate: 5–15%

Temperature Scalars (calibrated on 107 examples)

lead_opportunity · 18 samples · 0.92
competitor_weakness · 14 samples · 0.88
influencer_amplification · 9 samples · 1.05
churn_risk · 21 samples · 0.79
misinformation_risk · 11 samples · 0.85
support_escalation · 15 samples · 0.83
product_confusion · 8 samples · 1.08
feature_request_pattern · 6 samples · 0.97
launch_moment · 3 samples · 0.94
trend_to_content · 2 samples · 1.12

A temperature of 1.0 means uncalibrated. Values below 1.0 dampen overconfident outputs; values above 1.0 sharpen underconfident ones.

Measured. Not estimated.

Every number below comes from running deliverables/benchmark.py with 3 warm-up passes and 7 timed repetitions on Apple M-series hardware. No simulations. No projections.

BloomFilter.add() — time vs. input size

Per-operation cost: 12–13 µs (constant)
Memory model: O(n) bits
False positive rate: 1% (configurable)
False negative rate: 0% (guaranteed)

Built for teams. Not inboxes.

Social Inference Engine's signal queue is a shared workspace. Every action, assignment, and dismissal is timestamped, attributed to the acting user, and available to the whole team.

Role-based queue management

Three roles — VIEWER, ANALYST, MANAGER — control what each team member sees and can do. Managers see all signals including those assigned to other analysts. Viewers see the queue but cannot act. Analysts can act, dismiss, and submit feedback.

MANAGER · All signals · 5 signals
ANALYST · Assigned · 3 signals
VIEWER · Read only

Real-time signal delivery via SSE

New signals are pushed to connected clients over a persistent Server-Sent Events connection as soon as they are classified — no polling, no webhook setup, no third-party push service. Connect with a single curl command.

Terminal
$ curl -N -H "Authorization: Bearer $TOKEN" \
-H "Accept: text/event-stream" \
http://localhost:8000/api/v1/signals/stream
data: {"type":"churn_risk","confidence":0.87}
data: {"type":"lead_opportunity","confidence":0.79}
streaming...

Online calibration — the queue gets better with every correction

When the model misclassifies a signal, submit a correction via the API or the web UI. The ConfidenceCalibrator performs one gradient-descent step immediately — adjusting the temperature scalar for the corrected signal type without any service restart. Calibration improvements are visible in subsequent inferences within the same session.

Feedback submission: feature_request → churn_risk
1 update · 6 µs
✓ Temperature scalar adjusted immediately

Running in 5 minutes on any machine.

Social Inference Engine runs on macOS, Ubuntu, and Windows WSL2. No cloud account required. No SaaS sign-up. No data ever leaves your machine unless you configure an external LLM provider.

Option A — Docker Compose

Starts the full stack in one command. Recommended.

# Step 1: Clone
git clone https://github.com/Shengboj0324/Inference-Engine.git
cd Inference-Engine
# Step 2: Generate secrets and configure
cp .env.example .env
python3 -c "import secrets; print('SECRET_KEY=' + secrets.token_urlsafe(32))"
# Paste values into .env, then add your OPENAI_API_KEY
# Step 3: Start everything
docker compose up
# Step 4: Run initial calibration
docker compose exec api python training/calibrate.py --epochs 5
# Step 5: Verify
curl http://localhost:8000/health
Expected output:
Health check passed
{"status": "healthy", "database": "ok", "redis": "ok"}

Option B — Bare-metal

macOS / Ubuntu

Run the application process directly — useful for debugging and IDE integration.

Full installation guide

System Requirements

Component · Minimum · Recommended
CPU · 4 cores · 8+ cores
RAM · 8 GB · 16 GB
Disk · 10 GB · 30 GB (model weights + data)
Python · 3.9 · 3.11
Docker · 24.0+ (Compose v2) · Docker Desktop 4.28+
OS · macOS 12 / Ubuntu 20.04 / WSL2 · macOS 14 / Ubuntu 22.04

Connect to your own instance

Deploy Social Inference Engine locally, then set NEXT_PUBLIC_API_BASE_URL to connect this page to your running instance.


Set up your instance to enable the demo

This demo panel connects to a real Social Inference Engine API. Deploy your instance using Docker Compose (5 minutes), then set NEXT_PUBLIC_API_BASE_URL=http://localhost:8000 in your .env.local and restart the dev server.

# .env.local (website root)
NEXT_PUBLIC_API_BASE_URL=http://localhost:8000

# Then restart the dev server
npm run dev
Deployment Guide →

Frequently asked questions