Data Collection Methodology

How we collect, classify, and maintain community reports

Source Communities

We monitor the following public communities for peptide-related reports:

Reddit

r/Peptides, r/Semaglutide, r/Tirzepatide, r/Nootropics, r/Biohackers, r/PeptidesForWeightLoss, r/SARMs, r/moreplatesmoredates

Forums

Longecity.org, IronMag Forums, Professional Muscle, Men's Health forums

Self-experimenter networks

Public biohacking blogs and documented self-experiment threads

Eligibility Criteria

A post qualifies as a report if it meets ALL of the following:

  • References a specific named peptide or compound
  • Includes a first-person reported experience (benefit, side effect, dosing, or neutral observation)
  • Is not promotional, sponsored, or vendor-affiliated
  • Contains enough context to assign a confidence score of 1 or higher
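The four criteria above can be read as a single all-or-nothing check. The following is a minimal sketch only: the `CandidatePost` fields and the convention that a confidence of 0 means "too little context to score" are assumptions made for illustration, not part of the documented schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidatePost:
    # All fields are hypothetical; the source does not define a post schema.
    peptides_mentioned: List[str]  # named peptides or compounds found in the text
    is_first_person: bool          # author describes their own experience
    is_promotional: bool           # promotional, sponsored, or vendor-affiliated
    confidence: int                # assumed: 0 = too little context, otherwise 1 to 3


def qualifies_as_report(post: CandidatePost) -> bool:
    """Return True only if every eligibility criterion holds."""
    return all([
        len(post.peptides_mentioned) > 0,  # references a specific named peptide
        post.is_first_person,              # first-person reported experience
        not post.is_promotional,           # not promotional or vendor-affiliated
        post.confidence >= 1,              # enough context for a score of 1 or higher
    ])
```

Because the check uses `all`, failing any one criterion is enough to exclude a post.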

Classification Process

  1. Keyword detection: flags candidate posts that match the peptide entity list
  2. Human review queue: used for low-confidence classifications
  3. Category assigned: Benefit / Side Effect / Dosing / Neutral
  4. Subcategory tagged: from a controlled vocabulary (e.g. "recovery", "sleep", "nausea")
  5. Sentiment scored: Positive / Negative / Neutral
  6. Confidence score assigned: 1 (anecdotal mention) / 2 (detailed account) / 3 (structured self-experiment)
  7. Source URL recorded: a direct link to the original post is preserved
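As an illustration of how the outputs of these steps might fit together, here is a hedged sketch of a classified report record plus a simple keyword detector. The category, sentiment, and confidence values come from the steps above; the class names, the `detect_peptides` helper, and the three-entry `PEPTIDE_ENTITIES` list are hypothetical stand-ins, since the actual entity list and matching logic are not described here.

```python
import re
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import List


class Category(Enum):
    BENEFIT = "Benefit"
    SIDE_EFFECT = "Side Effect"
    DOSING = "Dosing"
    NEUTRAL = "Neutral"


class Sentiment(Enum):
    POSITIVE = "Positive"
    NEGATIVE = "Negative"
    NEUTRAL = "Neutral"


class Confidence(IntEnum):
    ANECDOTAL_MENTION = 1
    DETAILED_ACCOUNT = 2
    STRUCTURED_SELF_EXPERIMENT = 3


# Hypothetical entity list; the real list is not given in this document.
PEPTIDE_ENTITIES = ["BPC-157", "semaglutide", "tirzepatide"]


def detect_peptides(text: str) -> List[str]:
    """Return entity-list names that appear in the text as whole words."""
    found = []
    for name in PEPTIDE_ENTITIES:
        # Lookarounds keep a name from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found


@dataclass(frozen=True)
class Report:
    peptide: str
    category: Category
    subcategory: str        # drawn from the controlled vocabulary
    sentiment: Sentiment
    confidence: Confidence
    source_url: str         # direct link to the original post
```

Using enums for the fixed value sets means an unexpected category or score fails loudly at construction time rather than entering the dataset as free text.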

Update Frequency

The dataset is updated on a rolling basis. New reports are processed weekly. The peptide count table on /data-schema reflects cumulative totals since platform launch.
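One way to picture cumulative totals built from weekly processing is the sketch below. It assumes each weekly batch is simply a list of peptide names, one per accepted report; the function name and the batch format are illustrative, not the platform's actual pipeline.

```python
from collections import Counter
from typing import Iterable, List


def cumulative_totals(weekly_batches: Iterable[List[str]]) -> Counter:
    """Sum per-peptide report counts across all weekly batches."""
    totals: Counter = Counter()
    for batch in weekly_batches:
        totals.update(batch)  # adds one count per peptide name in the batch
    return totals


# Illustrative data only: two weekly batches of accepted reports.
weeks = [["semaglutide", "tirzepatide"], ["semaglutide"]]
totals = cumulative_totals(weeks)
```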

Conflict Resolution

When two reports conflict (e.g. one reports a benefit and another reports a side effect for the same peptide), both are retained independently. We do not adjudicate or weight conflicting reports. Users see the full distribution.
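Retaining every report and showing the full distribution amounts to an unweighted count per category. A minimal sketch, assuming reports are reduced to hypothetical `(peptide, category)` pairs:

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def category_distribution(reports: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count reports per category for each peptide, without weighting or dropping any."""
    distribution: Dict[str, Counter] = {}
    for peptide, category in reports:
        distribution.setdefault(peptide, Counter())[category] += 1
    return distribution


# Illustrative data only: conflicting reports for the same peptide are all kept.
reports = [
    ("semaglutide", "Benefit"),
    ("semaglutide", "Side Effect"),
    ("semaglutide", "Benefit"),
]
dist = category_distribution(reports)
```

Every report contributes exactly one count, so a conflicting pair shows up as two separate entries rather than cancelling out.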