Our work presented at the 2020 Plant and Animal Genome Conference

Jan. 20, 2020
Flier for the Plant and Animal Genome Conference 2020

PAG

The Plant and Animal Genome XXVIII conference (PAG) is the largest agricultural genomics conference in the world, and it draws a lot of interest in statistics, sensing, data sharing, and software. PAG is held each year in San Diego, and this year's conference was a great opportunity for me (David) to share our group's work and to learn about a wide range of research, software, and data in agriculture, high-throughput phenomics, and genomics.

There were many great talks and informal conversations with current and (hopefully!) future collaborators. A few of the presentations that described work in our group included:

  1. My presentation at the workshop “Challenges and Opportunities in Plant Science Data Management”, organized by Darwin Campbell, Carolyn Lawrence-Dill, Ian Brown, and Robert Davey. My talk was titled “Software to Streamline Sharing of Agricultural Algorithms and Data” (abstract). It described our efforts to make it easier for scientists to use, contribute to, and publish open software and data. The slides are attached below.

  2. Tyson Swetnam presented “The Airborne Environmental Observations Laboratory for Unoccupied Systems (AEOLUS)” (abstract) (slides). This included our work on the Drone Processing Pipeline to support the data management lifecycle and scale it to leverage high-performance computing resources. The talk presented scientific workflows as a “choose your own adventure” story: they need to be flexible enough to meet individual needs, but standardized enough to abstract away complex software engineering. He also described existing software as well as the computing infrastructure available through CyVerse and XSEDE, NSF’s network of high-performance computers.

  3. Sateesh Peri presented “PhytoOracle: A Scalable, Modular Framework for Phenomics Data Processing and Trait Extraction.” (abstract) This talk described a substantial effort by the Applied Concepts in Cyberinfrastructure class in Fall 2019 to re-architect the TERRA REF pipeline, making it more modular and scalable as well as easier for scientists to use and contribute to. Key features include the first use of new algorithm templates being developed by Chris Schnaufer in our group, as well as adoption of the Makeflow workflow engine and Work Queue for scaling to high-performance computers. Learn more about the class's software, PhytoOracle, in the GitHub repository and documentation that showcase their impressive work. The architecture that they developed was quickly integrated into our Phenomics pipeline development work, which is refactoring the software used to process data from the TERRA REF gantry system in Maricopa. Three students are continuing to contribute to this effort.

  4. Anne Thessen presented “Predicting Phenotype from Multi-Scale Genomic and Environment Data using Neural Networks and Knowledge Graphs: An Introduction to the NSF GenoPhenoEnvo Project” (abstract). The talk described a project we are working on to develop a machine learning framework that can predict plant phenotypes across spatial, temporal, and taxonomic scales from genomic and environmental data. The first year focuses on using TERRA REF data, and in the second year we will begin scaling up to use data from the National Ecological Observatory Network and the National Phenology Network. You can learn more about this work at our GenoPhenoEnvo project website.


LeBauer PAG Presentation
