HYPHY: COMPREHENSIVE, FAST, AND USER-FRIENDLY SOFTWARE FOR EVOLUTIONARY ANALYSIS

Abstract

HyPhy (www.hyphy.org) is a scriptable software platform designed to enable flexible and powerful analyses of DNA, RNA, codon, amino acid, and other types of sequence data in an evolutionary context. Since their initial formal release in 2005, HyPhy and its free web server Datamonkey (www.datamonkey.org) have become stable and mature, have been cited in over 2,000 peer-reviewed publications, have been described in multiple book chapters, and have been the subject of many invited workshops. In the first four-year funding cycle we focused on improving and hardening the "back end" elements of the HyPhy system. In this proposal, while we will still do work at that level, we focus more on the "front end", where interactions with users are key.

Aim 1: Re-architect the HyPhy analytical engine and its scripting language.
1.1 Redesign the HyPhy Batch Language (HBL) to enhance its productivity, reliability, maintainability, portability, and reusability, while maintaining backward compatibility.
1.2 Redevelop the standard library of evolutionary models, standard analyses, and inference procedures to make them self-documenting, easy to learn, easy to extend, robust to inadvertent misapplication, and compliant with data exchange formats and communication protocols.

Aim 2: Models and algorithms for large and complex datasets.
2.1 Improve HyPhy performance to handle much larger datasets in a single analysis by accelerating the fundamental operations in hardware and software.
2.2 Allow users to combine sequence and other quantitative data in a likelihood framework.
2.3 Develop a library of analyses for rapidly evolving pathogens and immune repertoires.

Aim 3: Web-browser-based graphical interface for all computing devices. Presently, the majority of HyPhy users interact with the program through www.datamonkey.org. Keeping in mind the demand for such a user experience, we will:
3.1 Implement a complete interface for data exploration, analysis definition, job execution, and result interpretation and visualization as a local web application. This interface will run on computers, tablets, and smartphones.
3.2 Develop the computational core of HyPhy as a native browser plug-in, making HyPhy an "app" that is distributed, maintained, and run entirely in a browser.
3.3 Re-implement datamonkey.org using modern web technologies (node.js); provide a public instance that can be accessed from any instance of HyPhy, and a distribution that allows labs to set up their own cluster- or cloud-based instances.

Project Period

2010-2019