How a tool to map computer viruses came to power biology research

By Edward Chen Sept. 6, 2022

When mathematicians Leland McInnes and John Healy walked into their workplace’s annual “Big Dig” in 2017, a sort of classified hackathon for Canada’s version of the National Security Agency, they were not thinking about biology at all. They wanted to find a way to quickly spot the differences between computer viruses.

They ended up creating a tool to simplify datasets and visualize the data points in them: an algorithm they named Uniform Manifold Approximation and Projection, or UMAP. They published a paper on it in 2018. To their great surprise, in less than five years, it has become one of the most widely used tools in modern biology research. UMAP has now been applied to everything from forecasting rain in the Alps to identifying the many-hued pigments in a Gauguin artwork to modeling how Covid-19 tweets are disseminated. And, of course, scientists have used UMAP to study the virus itself. The technique is now the method of choice for most computational biologists who want to see what, exactly, is going on in a dataset.

“Almost every paper is going to have a UMAP in figure one,” said John Marioni, a group leader at the European Bioinformatics Institute and a faculty member at the Wellcome Sanger Institute. “I would say it’s almost become standard in the analysis. There are a few alternative visualizations, but, in general, the first figure in most papers, it’s going to be generated using the UMAP algorithm.”

About the Author

Edward Chen

Edward Chen is a freelance writer. He was previously a STAT intern and AAAS Mass Media Fellow.

@edwrdchen
