Subharvest is a Reddit research Chrome extension for voice-of-customer work. It helps users discover Reddit threads, extract comments with context, cluster pains and objections, and export evidence-backed research data.

Can Subharvest export Reddit comments to CSV?

Yes. Subharvest exports Reddit comments to CSV, JSON, and Markdown with fields such as author, score, depth, permalink, timestamp, parent ID, and body text.

How is Subharvest different from generic comment scrapers?

Subharvest is focused on Reddit research workflows instead of broad multi-platform scraping. It combines subreddit discovery, thread scraping, local signal clusters, evidence links, and research-ready exports.

Is Subharvest local-first?

Yes. The extension is designed as a local-first workflow. Scraped data can be reviewed and exported from the browser before any optional paid sync or AI feature is added.

Reddit voice-of-customer research

Reddit Research With Receipts.

Discover Reddit threads, extract comment context, and cluster pains, objections, requests, and buying triggers with source links.

Get Early Access View Plans

Status: Operational

Stop mining raw data. Start harvesting customer language.

Free handles clean exports. Hobby gives solo researchers more room. Pro turns a niche into repeatable research. Research keeps market intelligence updated.

Core Systems

A research workbench, not a raw exporter.

Market Discovery

Start with subreddits, feeds, scores, and keywords. Move beyond one-off thread scraping into repeatable market research.

Evidence Clusters

Group raw comments into pains, requests, objections, failed solutions, buying triggers, and competitor mentions.

Research Exports

Export clean CSV, JSON, and Markdown evidence packs built for spreadsheets, briefs, docs, and LLM prompts.

Workflow

From subreddit to sourced insight.

Step 01

Discover

Point Subharvest at a niche. Find high-signal Reddit threads across selected subreddits before you scrape anything.

Step 02

Extract

Capture comment trees with context: parent IDs, depth, submitter flags, timestamps, scores, authors, and permalinks.

Step 03

Prove

Turn clusters into a research brief where every insight links back to the original Reddit evidence.

Output Preview

Evidence packs, not mystery summaries.

r/SaaS - Pain cluster3 source comments

"Most scraping tools lose the nested context. If I cannot see what they were replying to, I cannot validate the pain point."

Cluster: Pain points, requests, objections
Fields: parent_id, depth, OP flag, score, timestamp, permalink
Output: Markdown evidence pack plus CSV and JSON

Comparison

Win on Reddit depth first.

Parameter

Broad scraper

Subharvest

Core job

Export comment text

Build evidence-backed Reddit research

Discovery

Manual thread hunting

Subreddit discovery and bulk queues

Context

Flattened comments

Parent IDs, depth, OP flags, links

Analysis

Generic summaries

Pains, requests, objections, triggers

Exports

CSV download

CSV, JSON, Markdown evidence packs

Privacy

Account or cloud workflow

Local-first by default

Pricing

Pricing tied to research value.

Early-access builds expose the local MVP while Stripe checkout, account login, and tier gates are prepared.

Solo researchers

Hobby

$48/yr

Save 20% yearly

Small queues plus saved context

Subreddit discovery
Small research queues
Saved local projects
Markdown evidence packs
Light run history
Spreadsheet-ready exports

Choose Hobby

Recommended

Most useful

Pro

$72/yr

Save 33% yearly

The complete repeat research workflow

Larger bulk queues
More saved projects
Cross-run dedupe
Run comparison
Airtable and Notion presets
Project evidence exports

Choose Pro

Market monitoring

Research

$120/yr

Save 33% yearly

Recurring tracking and Sheets workflow

Scheduled refresh (planned)
New since last run
Google Sheets sync
Competitor mention tracking
BYO-key AI summaries (planned)
Larger queues and history

Choose Research

FAQ

System Queries.

How does the extraction work?

Subharvest reads Reddit JSON from your browser context, preserves comment metadata, and keeps source links attached to every exported row and insight.

What metadata is preserved?

We keep everything: comment depth, parent IDs, author flags (OP, Mod, Admin), exact Unix timestamps, and full permalinks.

Is there an AI layer?

The core engine is deterministic and local-first. AI summaries should sit on top of source-linked evidence packs, with a BYO-key option planned for the Research tier.

Are paid tiers live?

Not yet. Free, Hobby, Pro, and Research define the planned Stripe gates. The current early-access build focuses on validating the local Reddit research workflow before payments are enforced.

Ready for input

Build research from real Reddit language.

Get Early Access