i5K Coordinating Group

Conference Call Agenda

Apr 15, 2020

11:00-12:00 (EDT)

I5K in the Time of Corona. The focus will be on setting up a project database.

Attending: Mark Blaxter, Chris Childers, Kevin Hackett, Mike Pfrender, Monica Poelchau, Anna Childers, Meg Allen, Glenn Hanes, Stephen Richards, Brad Coates, Jay Evans, Sue Brown, Surya Saha, Brian Scheffler, Mike Branstetter, Duane McKenna, Mara Lawniczak, Rob Waterhouse, Mary Ann McDowell

  • EBP Update (fringy)
    • Subcommittees: IT subcommittee has finally met and is working on its documents
      • Monica's question: What data types will be generated? She will send out an email to the group soliciting feedback for the IT subcommittee
      • Mark: We should reuse what has already been built, in order to preserve interoperability
      • Fringy: How do we compute on the data, e.g. analyze 5,000 references? Does this infrastructure exist? Computing on the tree of life. DToL intends to have as much stored as possible at the INSDC. They should make the data available as seamlessly as possible – that's their role. The compute problem, however, is a question that EBP has to ask and answer.
      • Retrieving all chemoreceptors from all arthropods – that's a task for ENA
      • Monica's other question: if you could do anything with this data – what is currently a roadblock that we should solve?
        • Meg Allen: she frequently works with transcriptome/gene expression data. TSA and SRA don't have ways to handle specific life history stages (insect ontologies aren't well developed and aren't used for SRA/TSA metadata submission).
    • Other committees have mostly ground to a halt for now
    • Harris is looking for COVID-19 virus receptors in other animals; others have had the same idea and are trying to publish.
    • Value of collections (Kevin is on a government collections committee) – can look for signatures of virus there. It can be difficult to get anything else published now.
  • Expanding (or reducing) membership
    • If anyone has suggestions for additional members, please send them to Kevin
    • There are a lot of people on the list who don't attend much – may be due to time zone problems, or due to lack of interest
    • Defining i5k membership in relation to the Earth BioGenome Project – we may be able to look for new recruits that way.
  • Project Database (Mark, Mara, fringy, Monica): ENA-based
  • Meetings
    • Helsinki – Coronavirus! (Rob)
      • This has been postponed by a year – around 25 July 2021
      • Not sure how the symposium or talks will be set up in 2021 – they may be different, since research will have progressed.
    • ESA-Orlando (Brad)
      • Genomics symposium was accepted; as far as we know, the meeting is still on. The symposium is for 3 hours, and will hit a broad swath of talks surrounding arthropod genomics.
      • Date: 15-18 November 2020
    • Arthropod Genomics Symposium (Mike, Mary Ann, Sue)
      • How will the ICE delay affect the plans for 2021 AGS? It was proposed to start June 10th 2021 at Notre Dame; reservations can be cancelled
      • Should we have a virtual meeting this year? Or an online mini-symposium?
      • What about an online bioinformatics workshop? Webinars? Rebranding i5k webinars for AGS – using the same mailing list? Regular (shorter) presentations that you would have had anyway? I5k webinars – Anna could use some help
      • What if participation in AGS is lower in 2021 because of ICE? Repeat of similar talks? How to handle time zones?
      • Could the talks that were scheduled for ICE happen through the i5k webinar series? Rob could ask his speaker list, or look through other confirmed speakers. Have several shorter talks, and time for discussion/interaction afterwards. Could organize them according to topic, similar to what AGS already does. Could also have one US morning session (good for people in Europe), one in afternoon (good for US west coast), at similar time to when ICE was already going to happen. Perhaps have it just on one day, rather than broken up over several days – might provide more momentum.
      • Has anyone had success with day-long virtual seminars? Kevin has phoned in, and it worked.
      • Add newsletters to get important information out? This would be more work.
      • Interaction is always the most difficult thing to facilitate – this may be a major task.
      • If it's recorded and broadcast – perhaps some people who have unpublished data won't want to be broadcast.
      • Rob, Sue, and Mike will take the lead on organizing webinars this year to replace ICE and AGS; Mary Ann and Anna will contribute
    • PAG Ag100Pests Talks (Anna; Surya)
      • PAG is still fairly far off – this may be a good place for a broader project talk (e.g. Ag100Pests); if possible, pull more talks on specific genomes into the Arthropod Genomics session
    • EBP meeting 6-9 October in Hinxton:
      • Still figuring out whether it should carry on given coronavirus
      • Alternatives to having an in-person meeting: They will either work towards a digital meeting, or cancel, or postpone
  • Project Updates: How are things going in this CoVIDity time? Is progress still possible?
    • DToL (Mark, Mara)
      • Mark sent a link to slides on DToL progress. https://docs.google.com/presentation/d/12OS9yBTVve6dJUxLCQw5bO4kuWoRazoJ1YkbjiD0Bfo/edit#slide=id.g739f0604d3_2_100
      • Goal – generate reference-quality genomes as primary output, as infrastructure for scientific research
      • First 3 years – sequence 2,000 species, references from each family in the British Isles. Corresponds to about 40% of all families.
      • Many arthropods in the list! About 1,000 species of arthropods to be sequenced by DToL.
      • Extensive metadata
      • High-quality specimens, for both DNA and metadata, are wanted for this project to produce the best data possible. Flash frozen from live. Extract good DNA and RNA; long reads and long-range (HiC) data for assemblies. Assemble, curate and publish.
      • Orthopteran DNA extraction has been difficult.
      • Plan is to sequence from single specimen.
      • Long-read data are mainly PacBio Sequel II HiFi data: effectively Illumina quality, but long reads. Pie chart of current status on slide 17. ~200 species in progress, ~60 scaffolded with N50 > 10 Mb and ~100 with sequencing.
      • Interim data portal: https://github.com/darwintreeoflife/darwintreeoflife.data
      • 'Collective portals' where we can all share what we're doing. How do we do this globally? Can we find a report of who is doing what?
      • *Monica will put Mark in touch with the IT subcommittee group
    • Ag100Pests, including Desert Locust (Anna)
      • Now have a list of over 100 species we are working on
    • Beetles (Duane)
    • Cornome (Brad, Jay, Dave)
    • Bees (Brian)
      • No updates – DNA has not been good enough yet; bee DNA isolation is a problem
      • Honeybee genome had GC content issues – DNA degraded fairly quickly. How is DToL handling this?
    • France (Denis)
    • California Ecosystem (fringy)
  • I5K Webinar Update (Anna)
  • New Funding: New Ideas?

NEXT MEETING: May 20, 2020

11:00-12:00 (EDT)