Announcing the Release of Papilloma Virus Episteme (PaVE) 2.0

The NIAID Bioinformatics and Computational Biology team and Alison McBride, Ph.D., Chief of the DNA Tumor Virus Section in the Laboratory of Viral Diseases, have worked together to launch PaVE 2.0.

About PaVE
The Papilloma Virus Episteme (PaVE) was initiated by NIAID in 2008 to provide a highly curated bioinformatic and knowledge resource for the papillomavirus scientific community. PaVE rapidly became the “go-to” resource for papillomavirus researchers worldwide because of its trusted and seamless integration of data and analytical tools. Contributions to the database and curation are provided by the scientific staff of NIAID and by researchers at the University of Arizona. Since its inception, PaVE has been accessed by more than 44,000 individual users in 151 countries. The PaVE URL has been cited more than 620 times in Google Scholar, and the site consistently has 40-50 users per day.

For the last 14 years, PaVE has been a steadfast and invaluable resource. Behind the scenes, however, the software infrastructure became severely outdated. In PaVE 2.0, the underlying libraries and hosting platform have been completely upgraded and rebuilt because the original software framework was deprecated. The PaVE 2.0 team researched and developed open-source, cloud-based pipelines for continuous integration and deployment (CI/CD) of both applications and data. PaVE 2.0 is hosted on an on-demand virtual server using the infrastructure-as-code NIAID “Monarch” tech stack. The framework has been upgraded to Python Flask with a JavaScript/Jinja template front end, and the database has been switched from MySQL to Neo4j. A Swagger application programming interface (API) performs all database queries and executes jobs for BLAST, MAFFT, and the custom L1 typing tool developed specifically for PaVE. The API will also allow potential future programmatic access to the data.

All major tools in PaVE 2.0, including BLAST search, the L1 typing tool, the Locus Viewer, phylogenetic tree generation and viewing, multiple sequence alignment, and the protein structure viewer, have been modernized and enhanced to make them more robust and able to support more users. For example, the new Celery distributed task queue supports longer-running tasks (such as large BLAST jobs and the L1 typing tool). Multiple sequence alignment now uses MAFFT instead of CLUSTAL. The protein structure viewer has been upgraded from Jmol to Mol*, the new embeddable viewer used by the Protein Data Bank.
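
For readers unfamiliar with the task-queue pattern, the sketch below illustrates the general idea: a long-running job is handed to a background worker, the caller gets a job ID back immediately, and the status is polled later. This is not PaVE's code. It swaps in Python's standard-library thread pool as a stand-in for Celery, and the names JobQueue, build_mafft_command, and the status strings are hypothetical. The only external detail assumed is MAFFT's documented `--auto` option, which lets MAFFT pick an alignment strategy and writes the alignment to standard output.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional


def build_mafft_command(fasta_path: str) -> List[str]:
    """Return an argv list for aligning one FASTA file with MAFFT.

    Only builds the command; running it requires MAFFT to be installed.
    """
    return ["mafft", "--auto", fasta_path]


class JobQueue:
    """Hypothetical in-process stand-in for a distributed task queue."""

    def __init__(self, workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, func: Callable[..., Any], *args: Any) -> str:
        """Start the job in the background and return its ID immediately."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(func, *args)
        return job_id

    def status(self, job_id: str) -> str:
        """Report 'PENDING', 'SUCCESS', or 'FAILURE' (illustrative names)."""
        future = self._jobs[job_id]
        if not future.done():
            return "PENDING"
        return "FAILURE" if future.exception() else "SUCCESS"

    def result(self, job_id: str, timeout: Optional[float] = None) -> Any:
        """Block until the job finishes, then return its value."""
        return self._jobs[job_id].result(timeout=timeout)
```

Under these assumptions, a web request handler would call submit() with a function that runs the command from build_mafft_command() and return the job ID to the browser, which can then poll status() without holding a connection open for the full duration of the alignment.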

As a new feature, we added a 3D Viewer for exploring the inside of the virion.

Check out PaVE 2.0 by visiting

About human papillomavirus (HPV)
Five percent of human cancers are due to human papillomavirus (HPV) infection; oncogenic HPV infection is the causative agent of almost all cervical carcinomas and over 70% of oropharyngeal cancers. In the US, the incidence of HPV-associated oropharyngeal cancer currently outnumbers that of cervical cancer, but cervical cancer causes ~0.34 million deaths globally each year. HPV infection also results in clinical outcomes ranging from asymptomatic infection to verrucae, plantar and filiform warts, and condylomata acuminata. HPV infection is particularly problematic in people with acquired (e.g., HIV infection or organ transplantation) or inherited immunodeficiencies. Many of these HPV-associated diseases are studied at NIAID, NCI, and other NIH institutes. Although HPV vaccines are currently available for nine HPV types, HPV-associated disease will continue to be a major health burden for many decades to come.

To date, over 440 different HPV types have been identified. Each type is tropic for a specific anatomical niche in the stratified epithelium of the skin or mucosa. Some HPV types are oncogenic, while others cause asymptomatic infection and can be considered part of the human microbiome. The genomic and derived protein sequences of these viruses provide a treasure trove of data for comparative genomics.
