US-India Workshop 2 Program
Thursday 22nd March | Workshop - day 1 |
---|---|
7.30 - 9.00 | Breakfast |
9.00 - 10.30 | Welcome US Welcome – James Williams, Indiana University and PI TransPac3 and ACE programs – 5 mins Astronomy with Cutting-Edge ICT: Making sense of Transients using Geographically dispersed resources Abstract: Massive amounts of data are coming online from new-generation sky surveys. Combined with the geographically diverse archives and processing pipelines that need to run in real time, these programs are a prime example of the need for high bandwidth across continents. The presentation and demonstration will emphasize the critical need for high-capacity networks and advanced communication services to exploit US-India collaboration in Astroinformatics and Astrophysics in connection with the current and near-future programs. |
10.30 - 11.00 | Morning Tea |
11.00 - 12.30 | Research Collaboration How the US and India Network Organizations support Research Collaboration This session will provide updates since the last workshop on the network infrastructure that is available to researchers today or will be available to researchers in the relatively near term (1-2 years) and describe what the organizations that manage the networks do to facilitate research collaboration. Abstract: The National Knowledge Network (NKN) is a state-of-the-art multi-gigabit pan-India network for providing a unified high speed network backbone for all knowledge related institutions in the country. The purpose of such a knowledge network goes to the very core of the country's quest for building quality institutions with requisite research facilities and creating a pool of highly trained professionals. The NKN presentation will cover the key highlights of the project, Management overview, Technical overview, NKN connectivity status and key services offered. ERNET - Mr. Dipak Singh, Director of Network Operations, ERNET (Presentation) – 10 mins Current and evolving R&E network infrastructure and research support structures in the US and between US and India TransPAC3 - James Williams, Principal Investigator TP3 project - Indiana University (Presentation) – 10 mins Abstract: The talk will provide a brief update of the TEIN project focussing on the connectivity it provides, project plans, and its current and potential future support for global science applications. The Energy & Sciences Network (ESNet) support for Network-enabled Research Collaboration - Eli Dart (Presentation) - 15 mins Infrastructure Q/A session – 25 mins |
12.45 - 14.00 | Lunch |
14.00 - 14.45 | Network-enabled Access to Distant and High Cost Instruments - 45 mins Remote Access to and Control of Berkeley Synchrotron Beamlines by Homi Bhabha National Institute (HBNI), Anushaktinagar, Mumbai |
14.45 - 15.30 | Network-enabling of Global Classrooms – 45 mins NKN - A Gateway to a Global Classroom |
15.30 - 16.00 | Afternoon tea |
16.00 - 16.45 | Opportunities for new collaboration - 45 mins A Framework for Persistent Collaborations: PRAGMA Overview, Future, Lessons Learned, and Opportunities for US India Collaborations Abstract: PRAGMA, a 30-institution, international, grass-roots organization, explores and evaluates practical approaches to how cyberinfrastructure software can be used to enable and enhance scientific collaboration among both small and medium sized groups. Scientific "expeditions" are used to define which software components, available from the PRAGMA partnership and elsewhere, need development and experimental evaluation prior to deployment on larger production infrastructures. Regular face-to-face meetings enable the group as a whole to support new science areas; gain insight into cyber developments in a very timely manner; support and sustain experimental testbeds across multiple administrative domains; create training, education, and network building activities; and provide the persistent interactions that engender the trust needed as the foundation of international scientific and infrastructure development collaborations. In this presentation we will introduce PRAGMA as an example of a framework for persistent collaborations that rely on both physical and human networks. We will discuss future directions, lessons learned about collaborations, and present opportunities for collaboration between US and India researchers. Role of HPC in cyberinfrastructure and some experiences in US-India Collaborations. Abstract: This presentation will describe opportunities for institutional and individual collaborations in defining the leading edge in high-end computing, information technologies, and cyberinfrastructure. The talk will highlight the role of high-end computing in enabling breakthrough science and engineering in general as well as some of the challenges associated with large-scale simulations. 
An outline of several significant education and outreach activities as well as collaborative international projects will be provided. The presentation will conclude with mention of the impact of these initiatives on society at large. Fostering Indo-US computational science collaborations Abstract: A major hindrance in academic collaborations is intellectual property sharing. This talk will dwell upon open source development across multiple collaborating institutions, paying attention to Science Gateways, which provide Web-based environments for scientists and students to perform computational experiments online via Web interfaces using Web services and computational workflows. We believe there are important steps that should be taken to go beyond basic open source to address requirements for building open software communities. In addition to licensing and support tools, open communities must have open processes for making design decisions, accepting code contributions, adding new project members, reporting and resolving problems, and making well-packaged and properly licensed software releases. The Apache Software Foundation provides the infrastructure and mentoring experience to help open source communities address these project governance issues. Additionally, Apache has interesting requirements (such as developer diversity) that are designed to emphasize the neutrality of the code base (giving competitors a safe place to cooperate) and to help sustain projects through leadership turnover and funding cycles. I would like to discuss how forums like Apache can help US and Indian counterparts share code and collaborate without worrying about IP and cross-country funding issues. Afternoon Q/A - 15 mins |
17.00 - 17.15 | Sum-up from first day – George McLaughlin, TransPac3 Dinner on your own |
Friday 23rd March | Workshop - day 2 |
---|---|
7.30 - 9.00 | Breakfast |
9.00 - 10.30 | Welcome to day 2 and review of day 1 – James Williams – 10 minutes |
Network-enabling of Medical Research and Drug Discovery Collaboration Protein Structure Modelling on the IUCRG: A BRAF-caBIG® collaboration – 40 mins Abstract: The last few decades have witnessed the evolution of biology from what used to be a purely experimental field to a high-end computational domain, where unrelenting computational power is required to decipher pieces of data generated through high-throughput techniques into blocks of information that will help to answer many mysteries of life. To be able to generate knowledge from the oceans of genomic data, enabling technologies like High Performance Computing, Grid Computing and Cloud Computing are the latest weapons in the hands of the modern biologist. The importance of protein structures can be understood easily from the fact that the function of any protein is directly correlated to its structure. The three-dimensional structure of a protein directs its function within a cellular environment. A mutation in the protein sequence can lead to changes in its structure, which in turn may render the protein non-functional or even confer adverse functions leading to diseases like cancer. Over the decades cancer has become one of the most prevalent diseases, with deaths estimated to exceed 12 million in 2030 according to the World Health Organization. Proteins from almost 1% of the human genome have been identified to be involved in oncogenesis. In the absence of resolved structural data (the RCSB database has 73974 resolved protein structures as opposed to 534695 sequence entries in UniProtKB), one has to resort to computational techniques to get the 3D structures of proteins in order to properly understand their functions. 
The Bioinformatics Group at the Centre for Development of Advanced Computing (C-DAC), in collaboration with the cancer Biomedical Informatics Grid (caBIG®), has developed a grid-enabled, web-based automated pipeline for ab initio as well as homology-based prediction of protein structures, with an emphasis on cancer-related proteins. The pipeline has been deployed on the Bioinformatics Resources & Applications Facility (BRAF) hosted at C-DAC, Pune, India. The upstream component of the pipeline retrieves a protein sequence (according to user input) from the gridPIR service of caBIG®, which provides a data resource of high-quality annotated information on all protein sequences supported by UniProtKB. The retrieved sequence, in FASTA format, is then fed to the prediction pipeline. At its core the pipeline consists of two prediction engines for determining the 3D structures: an ab initio engine that uses the ROSETTA prediction algorithm and a homology modelling engine that uses the MODPIPE program. The graphical user interface of the pipeline enables the user to choose various control parameters, such as which secondary structure prediction algorithms to use, the number of iterations, the number of output structures, uploaded NMR constraint files, e-value, etc. Once submitted, the jobs are distributed over multiple processors on the Biogene supercomputing system at BRAF, which significantly reduces the prediction time. The resultant output comes in the form of predicted structures in PDB format and parsed energy log files, which can be downloaded by the user. All file transfers are secured over the network by SFTP. JMol has been integrated within the pipeline to provide a visual inspection of the predicted models. Test cases have been run using the pipeline with a few cancer-related proteins downloaded from The Cancer Genome Atlas (TCGA), where sequence data from various mutated proteins of affected patients are stored and made available in various data formats. 
Some of these results will be discussed during the presentation. Indo-US Cooperation in biomedical informatics Abstract: | |
10.30 - 11.00 | Morning Tea |
11.00 - 12.00 | Evolving areas of Network-enabled Collaboration Geosciences, Environmental Networks & Cloud Services, and PRAGMA A Knowledge R&D Networked Indo-US Collaboration: A case study in Earth Sciences – 20 mins Abstract: Firstly, we will cover the role of the GEON/PRAGMA projects, initiated primarily by SDSC-UCSD through NSF, in developing cyberinfrastructure in a wide range of Earth Science disciplines in India since 2005, and how this “IT head start” helped in the data fusion and visualization of a variety of earth science related data sets. Secondly, we will highlight significant achievements in producing a new breed of hybrid students, in terms of innovative manpower development through cross-fertilization of different science streams with IT. Thirdly, we provide a review of available data sets that are being generated in India by various organizations and their applications. In conclusion, we will refer to large data sets and the need to build cloud cyberinfrastructure for geosciences, a shift from grid middleware based cyberinfrastructure. Big Data and Cloud Benchmarking - 20 mins Abstract: As science collaborations become data-centric, even moving in the direction of joint analysis of large datasets, there is an increasing need for cyberinfrastructure to support data-intensive computing, and an opportunity for collaboration in the area of benchmarking "big data" applications at global scale. Can we build environments that use distributed computing and the cloud-based paradigm in which researchers in the US easily access and analyze scientific data from data archives in India, and vice versa? What type of system and network performance is required to sustain such applications? Is there an opportunity for Indo-US collaborations to study performance issues related to such data-intensive applications, and to develop related benchmarks? Morning Session Q/A - 20 mins. |
12.00 - 13.30 | Lunch |
13.30 - 14.45 | Sustainable US-India Network Enabled Research Collaborations - Where to from here? A panel comprising representatives of the US and Indian governments and scientists, taking into account the contributions made during the workshop, will debate and deliberate on ways to significantly further enhance Indo-US network enabled collaboration. In doing so the panel and participants will try to identify key issues, challenges, obstacles, and opportunities needed for the development of action plans, and identify next steps and future deliverables. This session will be followed by a wide ranging discussion among the participants which will help shape the final workshop recommendations. |
14.45 - 14.55 | Summing up of workshop - James Williams & N. Mohan Ram, DG, ERNET India |
14.55 - 15.00 | Closing Remarks - James Williams |