The case for owning your genetic data

HealthData Analysis
Nov 1, 2025

The quantified self movement, launched by Wired editors Gary Wolf and Kevin Kelly in 2007, promised self-knowledge through measurement. Track your steps, sleep, heart rate, calories—the data would reveal patterns, the patterns would suggest interventions, and you'd optimize your way to better health.

The movement largely failed. Not because the premise was wrong, but because the execution was impossible. Participants drowned in data without context. Ecosystems fragmented into proprietary silos. The cognitive overhead of maintaining a personal analytics practice exceeded what most people could sustain.

The databases that make genomics interpretable

Raw genetic data is just a list of variants. What makes it meaningful is interpretation—connecting those variants to what we know about human biology and disease.

Two databases do most of this work:

ClinVar, maintained by the NIH's National Center for Biotechnology Information, aggregates clinical interpretations of genetic variants. Researchers, clinical labs, and expert panels submit findings: variant X has been observed in patients with condition Y, with this level of evidence. ClinVar currently contains 3.8 million variant entries, with nearly 500,000 classified as pathogenic or likely pathogenic.

PharmGKB (Pharmacogenomics Knowledge Base) focuses specifically on how genetic variants affect drug response. The FDA now includes pharmacogenomic information on over 600 drug labels, some requiring or recommending genetic testing before prescription. PharmGKB catalogs these gene-drug relationships with evidence levels.

Both databases are public. Anyone can download them. The scientific community maintains them precisely because this knowledge shouldn't be locked away.
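To make "anyone can download them" concrete, here is a minimal sketch of reading a ClinVar-style tab-delimited summary into a lookup table. It assumes the column names used by ClinVar's variant summary download (ClinicalSignificance, RS# (dbSNP), Assembly) and that a missing rsID is recorded as -1; check both against the file you actually download. The function name and the sample rows are hypothetical.

```python
import csv
import io


def load_significance(handle, assembly="GRCh37"):
    """Map rsID -> clinical significance from a ClinVar-style summary file."""
    significance = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        rs_number = row["RS# (dbSNP)"]
        # Assumption: each variant is listed once per genome assembly, and
        # "-1" marks a variant without an rsID.
        if row["Assembly"] != assembly or rs_number in ("", "-1"):
            continue
        # Later rows for the same rsID overwrite earlier ones; a fuller
        # version would keep every record.
        significance["rs" + rs_number] = row["ClinicalSignificance"]
    return significance


# Made-up rows for illustration only; these are not real ClinVar records.
SAMPLE = (
    "#AlleleID\tGeneSymbol\tClinicalSignificance\tRS# (dbSNP)\tAssembly\n"
    "1\tGENE1\tBenign\t1111\tGRCh37\n"
    "1\tGENE1\tBenign\t1111\tGRCh38\n"
    "2\tGENE2\tPathogenic\t-1\tGRCh37\n"
    "3\tGENE3\tLikely pathogenic\t2222\tGRCh37\n"
)

lookup = load_significance(io.StringIO(SAMPLE))
print(lookup)
```

For a real download you would pass an open file handle (for a gzipped file, one opened in text mode with gzip.open) instead of the in-memory sample.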

Why pharmacogenomics is clinically actionable

Most genetic "risk factors" are probabilistic nudges—a variant might increase your lifetime risk of some condition from 10% to 12%. Useful for research, less useful for individual decisions.

Pharmacogenomics is different. These are genes that directly affect how you metabolize or respond to specific drugs:

CYP2C19 affects metabolism of clopidogrel (Plavix), a blood thinner prescribed after heart attacks. Roughly 30% of people carry variants that make the drug less effective. The FDA label recommends considering alternative therapy for poor metabolizers.

CYP2D6 affects codeine, tramadol, and many antidepressants. Some people can't convert codeine to morphine (no pain relief). Others convert it too rapidly (overdose risk). The FDA has issued multiple safety warnings.

VKORC1 and CYP2C9 together affect warfarin dosing. Patients with certain variants need dramatically different doses. The FDA provides a genotype-guided dosing table.

This isn't speculative. These are real clinical decisions where knowing your genotype matters.

The problem with uploading your data

Services like Promethease will cross-reference your genome against these databases for $12. They download ClinVar, run the comparison, format a report. It's convenient.

But you're uploading your genetic data to another company's servers.

In 2023, hackers stole data on roughly 6.9 million 23andMe users, including ancestry and relative-matching information. The company is now in bankruptcy, and 27 US states have filed suit to prevent the sale of customer genetic data in the proceedings.

Genetic data has a different threat model than other personal information. If your password leaks, you change it. If your genetic data leaks, it's permanent. It can identify your relatives. It reveals predispositions you might not want employers or insurers to know.

The alternative: run the analysis yourself

The analysis these services perform isn't proprietary. Download ClinVar. Download PharmGKB. Cross-reference your 600,000 variants against the databases. Output matches.

This is exactly what bioinformaticians do. The tools exist: Python, SQLite, standard data processing. The databases are public. The algorithms aren't secret.
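As a rough sketch of that cross-referencing step: the code below assumes the 23andMe raw export is a tab-separated text file with comment lines starting with # followed by rsid, chromosome, position, and genotype columns, and that the database annotations have already been reduced to a mapping from rsID to a label. The function names, rsIDs, and labels are hypothetical placeholders, not real findings.

```python
import csv
import io


def parse_raw_genotypes(handle):
    """Yield (rsid, chromosome, position, genotype) from a 23andMe-style export."""
    data_lines = (
        line for line in handle if line.strip() and not line.startswith("#")
    )
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) < 4:
            continue  # skip malformed lines
        rsid, chrom, pos, genotype = row[:4]
        if genotype in ("", "--"):
            continue  # skip no-calls
        yield rsid, chrom, int(pos), genotype


def cross_reference(genotypes, annotations):
    """Return the variants whose rsID appears in the annotation mapping."""
    return [
        (rsid, genotype, annotations[rsid])
        for rsid, _chrom, _pos, genotype in genotypes
        if rsid in annotations
    ]


# Made-up data for illustration only.
SAMPLE_EXPORT = """\
# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAG
rs0000002\t1\t2000\t--
rs0000003\t2\t3000\tCC
"""
SAMPLE_ANNOTATIONS = {
    "rs0000001": "example label A",
    "rs0000003": "example label B",
}

matches = cross_reference(
    parse_raw_genotypes(io.StringIO(SAMPLE_EXPORT)), SAMPLE_ANNOTATIONS
)
for rsid, genotype, label in matches:
    print(rsid, genotype, label)
```

A match on rsID alone only says that the position is annotated. A fuller pipeline would also compare your genotype against the specific allele the database entry describes before reporting anything.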

The barrier is labor. Setting up the pipeline, downloading and parsing the databases, writing the queries, formatting output—it takes hours of work and assumes familiarity with command-line tools and scripting.

This is where having your own compute environment matters.

Infrastructure for personal bioinformatics

The quantified self movement failed partly because it required too much ongoing effort. You had to track, sync, analyze, interpret—every day, indefinitely.

Genomics is different. You generate the data once. Your genome doesn't change. What changes is our understanding of it: new variants get classified, new gene-drug interactions get discovered. ClinVar updates weekly.

What you want is infrastructure: a place where your genetic data lives, where the reference databases are maintained, where you can run analyses now and re-run them as knowledge evolves. A personal bioinformatics environment.
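Re-running as knowledge evolves can be as simple as comparing two snapshots of the annotations for the variants you carry. The sketch below assumes each database release has been reduced to a mapping from rsID to a classification label; the function name and the sample data are hypothetical.

```python
def classification_changes(my_rsids, old_release, new_release):
    """Return {rsid: (old_label, new_label)} for variants whose label changed."""
    changes = {}
    for rsid in sorted(my_rsids):
        old_label = old_release.get(rsid)  # None if the variant was not listed
        new_label = new_release.get(rsid)
        if old_label != new_label:
            changes[rsid] = (old_label, new_label)
    return changes


# Made-up data for illustration only.
my_rsids = {"rs0000001", "rs0000002", "rs0000003"}
old_release = {"rs0000001": "Uncertain significance", "rs0000002": "Benign"}
new_release = {
    "rs0000001": "Likely pathogenic",
    "rs0000002": "Benign",
    "rs0000003": "Uncertain significance",
}

changes = classification_changes(my_rsids, old_release, new_release)
for rsid, (old_label, new_label) in changes.items():
    print(f"{rsid}: {old_label} -> {new_label}")
```

Only the reference data moves between runs. The list of your own variants stays fixed, which is why the comparison is cheap to repeat.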

On Zo, getting started is simple: upload your 23andMe data, then run our genomics analysis prompt. The system downloads ClinVar and PharmGKB, builds indexed databases, cross-references your variants, and generates a report.
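For readers curious what "indexed databases" and "cross-references" can look like in practice, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not a description of the actual prompt: the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3


def build_database(genotypes, annotations):
    """Load both datasets into SQLite and index the join column."""
    conn = sqlite3.connect(":memory:")  # use a file path to persist between runs
    conn.executescript(
        """
        CREATE TABLE genotypes (rsid TEXT PRIMARY KEY, genotype TEXT NOT NULL);
        CREATE TABLE annotations (
            rsid TEXT NOT NULL, source TEXT NOT NULL, note TEXT NOT NULL
        );
        CREATE INDEX idx_annotations_rsid ON annotations (rsid);
        """
    )
    conn.executemany("INSERT INTO genotypes VALUES (?, ?)", genotypes)
    conn.executemany("INSERT INTO annotations VALUES (?, ?, ?)", annotations)
    conn.commit()
    return conn


def matched_variants(conn):
    """Join your genotypes against every annotation that shares an rsID."""
    return conn.execute(
        """
        SELECT g.rsid, g.genotype, a.source, a.note
        FROM genotypes AS g
        JOIN annotations AS a ON a.rsid = g.rsid
        ORDER BY g.rsid, a.source
        """
    ).fetchall()


# Made-up data for illustration only.
genotypes = [("rs0000001", "AG"), ("rs0000002", "CC"), ("rs0000003", "TT")]
annotations = [
    ("rs0000001", "ClinVar", "example note A"),
    ("rs0000003", "PharmGKB", "example note B"),
    ("rs0000009", "ClinVar", "variant not in my data"),
]

conn = build_database(genotypes, annotations)
rows = matched_variants(conn)
for row in rows:
    print(row)
```

The index on the annotation table's rsid column is what keeps the join fast once the reference tables hold millions of rows.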

Your data stays on your server. And because there's an AI interface, you can ask questions about your data in natural language: "What variants do I have that affect statin metabolism?" "Show me anything classified as pathogenic in ClinVar."
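Underneath, a question like the second one can reduce to an ordinary database query. The sketch below shows one plausible shape for it, assuming a hypothetical matches table with rsid, genotype, and significance columns; it is not a description of what the AI interface actually runs, and the rows are made up.

```python
import sqlite3

# Exact labels are matched rather than a substring, so that entries such as
# "Conflicting classifications of pathogenicity" are not swept in.
PATHOGENIC_LABELS = (
    "Pathogenic",
    "Likely pathogenic",
    "Pathogenic/Likely pathogenic",
)


def pathogenic_matches(conn):
    """Return matched variants whose label is pathogenic or likely pathogenic."""
    placeholders = ", ".join("?" for _ in PATHOGENIC_LABELS)
    query = f"""
        SELECT rsid, genotype, significance
        FROM matches
        WHERE significance IN ({placeholders})
        ORDER BY rsid
    """
    return conn.execute(query, PATHOGENIC_LABELS).fetchall()


# Made-up data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (rsid TEXT, genotype TEXT, significance TEXT)")
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [
        ("rs0000001", "AG", "Pathogenic"),
        ("rs0000002", "CC", "Benign"),
        ("rs0000003", "TT", "Conflicting classifications of pathogenicity"),
        ("rs0000004", "GG", "Likely pathogenic"),
    ],
)

result = pathogenic_matches(conn)
for row in result:
    print(row)
```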

You end up with something closer to what serious researchers have: a queryable genomics environment, not a one-time PDF.

Getting started

First, download your raw data from 23andMe.

Then, with Zo, upload the file and run our personal genomics analysis prompt.

The output is yours: databases you control, on infrastructure you control, queryable by AI that works for you.