Article originally published by The Peak
By Robert Delaney

WuXi NextCODE: the Google for Genomic Data

An in-depth feature published by Hong Kong’s premier business and culture magazine, The Peak, about how WuXi NextCODE serves as "the Google of genomic data."

After years of inertia, the power of genomic data is being put to work on new diagnostic tools that promise to revolutionize medicine. WuXi NextCODE, a genomic data company spanning Europe, the US and China, is leading the race to become the equivalent of Google in medical research.


Researchers working together across the globe identified and mapped the human genome in 2003, in a project that was hailed as a breakthrough for medicine and as a possible panacea for many illnesses. And then the promise of genomics seemed to fade away.

Greater insight into the human genome, which contains all the information necessary to build a living organism, would not revolutionize medicine until someone developed the technology needed to store and analyze genomic data quickly.

About 15 years after work on the Human Genome Project was completed, the average time to diagnose an unknown and rare disease, for example, is seven years. Some of the individuals suffering from such illnesses, who make up 7 percent of the world’s population, deteriorate physically or die within that time frame.



WuXi NextCODE, a genomic informatics pioneer based in Shanghai, Cambridge, Massachusetts, and Reykjavik, Iceland, has emerged as one of the companies leading the charge in helping big pharma and others in the healthcare industry fulfill the project’s original aspirations.

The company, acquired by Shanghai and US-based WuXi AppTec in 2015 and merged with AppTec’s Genome Centre in Shanghai, leverages a proprietary platform that can cut the “diagnostic odyssey” for someone with a rare disease to a matter of hours.

The company’s Chief Executive Officer, Hannes Smarason, says WuXi NextCODE’s tool does for medical researchers and clinicians what Google’s search engine did for internet users nearly two decades ago.

Hannes Smarason, WuXi NextCODE CEO


Each human genome has 3.2 billion “bases,” or bits of information, requiring about 150 gigabytes of storage. Analyzing thousands of genomes, often the only way to produce diagnoses and targeted therapies, is time-consuming or impossible on databases not designed to handle such volumes.

The architecture of WuXi NextCODE’s “genomically ordered relational database,” or GORdb, was designed specifically to store and analyze huge amounts of genetic code and to index new genetic variations that appear in the hundreds of thousands of genomes sequenced by the company and its partners, as well as those held in every public data resource.

“For the first time, technology has really come together in a unique way to enable an entirely new industry around low-cost data generation on people’s sequences,” Smarason said in an interview with The Peak.

“Together with the computational power and the emergence of artificial intelligence [AI], to then make sense of and use that information for a variety of purposes is really opening up this market now to significant disruption.”


WuXi NextCODE can point to several inspiring medical outcomes that might not have been possible without its platform.

In an early demonstration of its technology, one of WuXi NextCODE’s founders, Dr. Jeffrey Gulcher, helped to halt the progression of a syndrome that was gradually sapping the sight, hearing ability and abdominal strength of two young sisters in Iceland. Early attempts to identify the genetic roots of the illness failed because clinicians had looked only at genetic variants already known to cause such symptoms.

Gulcher and his colleagues at deCODE Genetics, from which NextCODE was spun out in 2013, used an early version of what is now WuXi NextCODE’s rare disease diagnostics tool to home in on the gene and the variants in the gene causing the girls’ condition.

These capabilities enabled them to identify and understand the impact of the genetic variants the sisters inherited, a combination drawn from the DNA of each parent, which stunted the girls’ ability to absorb riboflavin.

Once the connection between riboflavin and the symptoms was understood, the girls were prescribed high doses of the vitamin, which halted further progression of their disease.

Last year, the Washington Post covered the story of a two-year-old boy suffering from seizures and breathing problems, who appeared to have a congenital condition that should have taken his life before his first birthday. A geneticist at Boston Children’s Hospital used the WuXi NextCODE platform and database to determine that the boy had a previously unknown genetic variation of the illness.

The diagnosis gave his doctors insight into what measures his parents would need to take – an oxygen mask to guard against sleep apnea, for example – to keep the child as healthy as possible.


“The difference between GORdb and other databases is that while they’re all relational databases, ours has a specific frame of reference – position on the genome – that facilitates the management of massive amounts of data on a genetic scale,” said Gulcher, who cofounded Reykjavik-based deCODE Genetics.

Jeffrey Gulcher, CSO of WuXi NextCODE

“Databases like Oracle were made for bank transactions, where you have maybe thousands of accounts, and each customer doing one thousand transactions per year.” The data challenge there, as Gulcher puts it, is a manageable one: a table with 2,000 rows and one column where the balance is updated after each transaction.

“By contrast, we have three billion columns to keep track of for each patient,” and the need to keep up to date all that is known in global reference data about each letter in the genome. “That’s the difference.”
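To make the idea of a genome-position frame of reference concrete, here is a minimal sketch in Python. It is not GORdb's actual design or interface, which the article does not describe. The `Variant` record, the `PositionOrderedStore` class and the sample rows are all hypothetical, chosen only to show why keeping rows sorted by chromosome and position makes a region lookup cheap.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple


class Variant(NamedTuple):
    """One hypothetical row: a genetic variant observed in one sample."""
    chrom: str
    pos: int
    ref: str
    alt: str
    sample: str


class PositionOrderedStore:
    """Toy store that keeps rows sorted by (chromosome, position)."""

    def __init__(self, variants):
        # Sort once by genomic position; this ordering is the "frame of reference".
        self._rows = sorted(variants, key=lambda v: (v.chrom, v.pos))
        # Parallel list of sort keys, so bisect can search without scanning rows.
        self._keys = [(v.chrom, v.pos) for v in self._rows]

    def query(self, chrom, start, end):
        """Return all rows on `chrom` with start <= pos <= end."""
        lo = bisect_left(self._keys, (chrom, start))
        hi = bisect_right(self._keys, (chrom, end))
        return self._rows[lo:hi]


# Made-up example data, for illustration only.
store = PositionOrderedStore([
    Variant("chr1", 1500, "A", "G", "patient_a"),
    Variant("chr2", 300, "C", "T", "patient_b"),
    Variant("chr1", 900, "G", "C", "patient_b"),
    Variant("chr1", 2500, "T", "A", "patient_a"),
])

region_hits = store.query("chr1", 1000, 2000)
```

Because the rows are ordered by position, a region query takes two binary searches and a slice instead of a scan of every row. That is the general advantage of position-ordered storage, whatever the real system does internally.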


Gulcher worked closely with his long-time informatics colleague Dr. Hakon Gudbjartsson, now WuXi NextCODE’s head of informatics, to develop GORdb as the technical underpinning for deCODE’s continuing work.

It has become the only database that holds all of the world’s publicly available genome reference data, in the same format as the hundreds of thousands of additional genomes being analyzed by WuXi NextCODE’s partner pharmaceutical companies, hospitals and governments looking to support the development of precision medicine and targeted therapies for specific populations.

Genomics England, for example, is a project set up by the UK government to sequence 100,000 genomes of National Health Service patients with rare diseases or cancers.

Other partnerships include an initiative with the National Heart Centre Singapore, in which WuXi NextCODE “will create an instantly queriable cloud-based enterprise data warehouse integrating large-scale whole genome sequences, medical and wearables data from both cardiovascular patients and healthy control subjects recruited by NHCS,” according to a joint statement by WuXi NextCODE and Singapore.

“This proof-of-concept resource will power research into novel genetic risk factors for heart disease in the Singapore and Southeast Asian populations more broadly,” the statement said.

WuXi NextCODE has also built partnerships with many of the world’s biggest pharmaceutical companies, including Novartis, AbbVie and Bristol-Myers Squibb, as well as medical institutions such as Boston Children’s Hospital and Peking Union Medical College Hospital. WuXi NextCODE works with all of them to mine data for correlations between symptoms and genetic variations.


All of the genetic code sequenced by WuXi NextCODE and its partners on the company’s platform has essentially created the world’s largest genetic “knowledge base.” This gives the company an edge in leveraging AI because the more data AI algorithms can trawl, the more accurate the results.

“I think NextCODE is more sophisticated than [other genomics algorithms available publicly] and might have some more elegant interfaces and data access capabilities and analysis capabilities, and so from that perspective it’s probably a pretty good platform,” said Alan Louie, research director for the life sciences practice at IDC, an independent market intelligence group based in Framingham, Massachusetts.

By applying AI to the genomic data WuXi NextCODE has amassed, the company is developing applications that may give doctors the ability to diagnose more prevalent illnesses with greater accuracy, and give pharmaceutical companies the ability to develop more effective treatments for them.

Pharmaceutical companies spend about US$150 billion (HK$1.17 trillion) a year worldwide on research and development. Smarason is betting that analysis done on WuXi NextCODE’s platform can replace up to US$10 billion of that total.

In one recent case study of its DeepCODE AI capabilities, WuXi NextCODE put the US National Cancer Institute’s flagship Cancer Genome Atlas (TCGA) into GORdb and trained its algorithms on 8,000 tumor samples to find genomic signatures that can instantly identify any of 22 major cancer types.

Pending clinical validation, a process designed to demonstrate that DeepCODE delivers a correct diagnosis to actual patients, WuXi NextCODE and its partners will begin to be able to use this capability to diagnose cancers, stratify clinical trials and inform the development of “liquid biopsies” for patients.

Thomas W. Chittenden, VP of Statistical Sciences of WuXi NextCODE; head of AI and deep learning

Summing up the benefit of DeepCODE, Thomas Chittenden, founding director of the WuXi NextCODE Advanced Artificial Intelligence Research Laboratory, said: “Soon what we want to be able to do is take your blood and look at it, and if there’s any circulating DNA that’s been shed by a tumor, we’re getting to the point where we can identify the variety with 99.5 percent accuracy.”

By doing so, clinicians will be able to bypass the now-lengthy process of determining cancer types, which can have a high error rate because cancer cells often metastasize and grow into tumors in different parts of the body. In other words, a tumor found in a patient’s lung might have grown from cancer cells that originally manifested in the colon.

Chittenden’s group also ran a “pan-cancer survival” study that in its earliest version can predict with 75 percent accuracy the 60-month patient survival rate across 20 varieties of cancer.

“There must be something fundamentally in common across cancers that leads to more aggressive tumors,” Chittenden said. “Knowing what’s common across the non-survivors could lead to treatments for the harder-to-treat cancers.”


Having operations in China, the US and Europe allows WuXi NextCODE to broker cooperative arrangements between its partners. For example, clinicians at Boston Children’s Hospital can share insights with their counterparts at Children’s Hospital of Fudan University in the hunt for the genetic cause of an unknown syndrome. The larger the pool of data, the easier it becomes to draw a conclusion.

Smarason says he considers the company’s global footprint one of its most important assets because it makes the knowledge base on WuXi NextCODE’s platform more robust than that of any other company in the industry. He also says the company’s geography and knowledge base mean it has no competitors.

The answer to the competition question isn’t so straightforward.

“Regarding the use of the platform to be able to more quickly identify interesting insights, IBM makes exactly that same claim using Watson to track all of the literature and scientific research with identified genomic variants and biomarkers that then relate to appropriate treatment regimens for disease,” IDC’s Louie said.

“Clearly, they’re a major player. Clearly, they have some industry-leading tools that are supporting that. Are they the only ones in the front? Probably not. Is there competition? Absolutely,” Louie continued.

While WuXi NextCODE’s knowledge base, on-the-ground presence in the world’s largest markets, and database architecture distinguish the company, others, such as Beijing-based BGI, San Diego, California-based Illumina and Madison, New Jersey-based Quest Diagnostics, offer some of the same services.

Others say WuXi NextCODE’s presence in China will keep the company ahead of whatever competition it faces.


WuXi NextCODE has been selling consumer-oriented products for more than a year in China, the only country where the company makes such products available. Its B2C line-up in China consists of FamilyCODE, a carrier test for prospective parents, RareCODE, a rare disease diagnostic product for children, and HealthCODE, which provides a report showing relative risk for 28 common diseases.

“One of the reasons we chose WuXi NextCODE is that it’s one of the very few companies in China that can access China and other global markets,” said Trency Gu, vice-president at Sequoia Capital China, which led a consortium that recently provided US$240 million in Series B financing for WuXi NextCODE.

“Other companies with consumer products in China are focusing on ancestry and information on how best to exercise or sensitivity to alcohol, whereas WuXi NextCODE is offering products with clear clinical demand.”


Investors in the latest financing round also include Singapore’s Temasek, Yunfeng Capital, which was founded by Alibaba Group chairman Jack Ma, and 3W Partners.

“What’s interesting about China is that it’s obviously not just a country, but a collection of regions, so there are these mega-cities within China that have tremendous appeal. All of them slightly different with different industries, with different populations, with different health issues, and that to me creates a significant opportunity,” Smarason said.

The case for developing the market for AI-leveraged genomic informatics in China is twofold.

First, as announced by the State Council, the country’s highest governing body, China aims to become a global AI leader by 2030, when the total output value of AI industries is targeted to surpass 1 trillion yuan. Reaching that goal will require support for an open and coordinated AI innovation system.

“Chinese State Council documents usually carry a lot of weight and they’re taken to the local governments and the local governments will apply substantial budgets towards the priority,” Kai-Fu Lee, CEO of venture capital firm Sinovation Ventures and former president of Google China, said in a recent webinar on AI in China.

“You can go back to the early 2000s, where another State Council document talked about the decision to go with high-speed rail,” Lee said. “That was over 15 years ago and today China has the world’s most advanced application in high-speed rail.”



That kind of support bodes well for WuXi NextCODE considering that one of its next steps is to release a software development kit for its platform in a bid to create “an app store,” for which others can also develop genomic AI-leveraged healthcare applications.

“The government is highly supportive in China with not just monetary incentives, but it has also set specific targets to encourage domestic intellectual property creation, talent, and applications in the area of AI-leveraged applications,” Jenny Lee, a Shanghai-based managing partner at GGV Capital, said in an interview.

“If [WuXi NextCODE] can build a developer community around its platform and database, the power of the database can be released and shared with more developers, who may not be industry experts but are looking to innovate in this sector. If the database is comprehensive and global, the company will be able to tap into a more vibrant developer community.”

GGV Capital currently has no investment in WuXi NextCODE.


The Chinese government’s support for the kind of diagnostic tools that WuXi NextCODE is marketing in the country strengthens the case for focusing on that market.

“We think these products have clear clinical demand and social benefits if they are launched in the market and have wide application in the market, especially RareCODE and FamilyCODE,” Sequoia’s Gu said. “There is a very clear social benefit in reducing incidents of birth defects, which cause a lot of burden for society.”

WuXi NextCODE says it has seen interest among administrations at all levels in China to spread the usage of these tools as a matter of public health. In the meantime, the recent investment by Sequoia and its consortium partners will help WuXi NextCODE grow worldwide.

The company plans to use the proceeds “to accelerate the extension of its platform infrastructure and to bring new users and data on board through precision medicine and diagnostics partnerships; the commercialization of its consumer solutions for the China market,” according to a company statement.

As this acceleration happens, the growth of WuXi NextCODE’s platform may drive the kind of change in medical treatment that people hoped for when the human genome was first decoded.

“I would argue that we will have a much greater understanding of the human genome within the next five years and there will be value in having that data down the road,” IDC’s Louie said.

Note: this article was originally published in the November 2017 issue of the South China Morning Post’s magazine, The Peak, under the title “The Long Promise of Genomes.”
