Harvard Catalyst Profiles: Next Steps in Research Networking
Griffin Weber, MD, PhD
Beth Israel Deaconess Medical Center, Boston, MA; Harvard Medical School, Boston, MA
Abstract
Harvard Catalyst “Profiles” is an open source research networking website developed by Harvard’s Clinical and Translational Science Center. It contains research profiles for more than 20,000 faculty at Harvard, and the software is used at several dozen other institutions across the country. In this talk, we will demonstrate how Profiles is addressing four challenges that we see as next-steps in research networking: federated queries, personalized search, team formation, and expanding beyond biomedicine.

Introduction
We created investigator profiles for more than 20,000 faculty using a variety of data sources, including PubMed, ISI Thomson Web of Knowledge, and multiple internal systems. These profiles are linked together through Passive Networks, which are automatically generated based on information known about investigators. For example, we extract keywords from publications and use this to build networks of people with similar interests. Users can also create Active Networks, by looking up people they know and describing their relationships to them, such as “collaborator” or “advisor”.

Methods
Profiles is one of several networking programs that play a prominent role among the CTSAs. Each attempts to solve particular problems, and together, these have great potential for translational research. Cornell’s VIVO [1] utilizes semantic web technologies, University of Pittsburgh’s Digital Vita [2] excels in biosketch generation, and commercial websites like SciLink [3] and BioMed Experts [4] use public data sources to create millions of researcher pages. Profiles is unique in its ability to discover and visualize complex network relationships between researchers, in real-time on a very large scale, using both public and local administrative databases. We take advantage of this by performing automated social network analysis (SNA) of the Profiles database on a nightly basis, which generates nearly a billion data points describing investigators, their research, and their collaborations. In the past year, we have converted Profiles to an ontology-based system and begun importing additional data types, such as patents, course catalogs, and books.

Results
Federated Queries. Profiles is building national networks of researchers in three ways: 1) A key functionality of VIVO is its use of semantic web to share information among institutions. Profiles now uses the VIVO 1.1 Ontology to present its data as RDF triples, making it appear no different than any other node on a VIVO network. 2) Profiles contains the simple aggregate count API required for participation in the Distributed Interoperable Research Experts Collaboration Tool (DIRECT) project being launched by the CTSA Research Networking Group. 3) Profiles has a custom XML-based web service that allows it to perform complex integrated searches across other instances of Profiles.

Personalized Search & Team Formation. There are many factors that influence a person’s decision to contact a potential collaborator in addition to the number of publications that match a search phrase. A new personalized search feature in Profiles takes information known about the user, such as department, office location, faculty rank, research area, and SNA centrality, to suggest investigators who would likely make good collaborators based on team science theory [5]. Profiles also uses these same principles to analyze teams of people to predict if they will form a successful collaboration.

Beyond Biomedicine. PubMed and MeSH greatly simplify the process of building research networking tools for biomedical investigators, but publication repositories and controlled vocabularies are much less available in other disciplines. In expanding Profiles to other schools at Harvard, we’ve discovered which departments benefit most from data purchased from Web of Knowledge; we’ve created vocabularies for various fields by mining the University’s library catalog of 15 million books; and we’ve added data types such as patents, teaching, and mentoring, which are important to certain fields.

Conclusion
We initially developed Profiles as a tool to help researchers find collaborators within Harvard Medical School; however, we have since expanded to other universities and disciplines and used SNA to go beyond simple Google-like searches. This project was supported by Grant Number 1 UL1 RR025758-01, Harvard Catalyst, from the NIH/NCRR.

References
1. http://vivo.cornell.edu
2. http://researchgateway.ctsi.pitt.edu/digitalvita
3. http://www.scilink.com
4. http://www.biomedexperts.com
5. http://iknow.northwestern.edu
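
Illustrative sketch. The keyword-based Passive Networks described in the Introduction can be illustrated with a short example. It assumes each investigator is reduced to a set of keywords and that two investigators are linked when the Jaccard overlap of their keyword sets reaches a threshold. The similarity measure, the threshold value, and every name below are assumptions made for illustration; the abstract does not specify how Profiles computes these networks.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def passive_network(keywords_by_person, threshold=0.25):
    """Return (person, person, similarity) edges for every pair of people
    whose keyword overlap meets the (assumed) threshold."""
    edges = []
    people = sorted(keywords_by_person.items())
    for (p1, k1), (p2, k2) in combinations(people, 2):
        score = jaccard(set(k1), set(k2))
        if score >= threshold:
            edges.append((p1, p2, score))
    return edges


# Hypothetical investigators and keywords, invented for this example.
example = {
    "investigator_a": {"asthma", "genomics", "pediatrics"},
    "investigator_b": {"asthma", "genomics", "epidemiology"},
    "investigator_c": {"cardiology", "imaging"},
}
edges = passive_network(example)
```

Here investigators a and b share two of four distinct keywords and are linked, while investigator c shares none and stays unconnected. A production system would likely weight keywords by frequency or specificity rather than treat them as a plain set.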