CONFERENCE PROGRAM
The following tentative schedule gives a sense of how the conference will proceed. Times are subject to change, though every effort will be made to keep the opening and closing times of the conference fixed so that people can make travel arrangements. Paper presentations may shift slightly, but they are not expected to move to a different day or between morning and afternoon.
11:00am Registration Opens
1:30pm Conference Convenes
1:30 Special Focus Tutorials: Language Processing of Biological Data
1:30 Special Focus Tutorial 1
NLP Techniques for Information Extraction from Biological Documents
- Resource building and our experience
Professor J. Tsujii, University of Tokyo (Japan)
Demands for Information Extraction (IE) and text mining have been increasing rapidly in the biological and medical sciences. Most of the on-going projects treat Medline abstracts since this is the largest collection of papers in these fields. Compared with newspaper articles, reports, etc., the abstracts in these fields have peculiar characteristics, which make the IE task harder than those we have treated so far. In particular, complex term formations, systematic metaphor, numerous semantic classes and various types of co-ordinations and parenthetical expressions pose serious challenges for existing NLP techniques. In this tutorial, I will talk about our experience of IE in these fields together with resource building attempts.
2:50 Break
3:10 Special Focus Tutorial 2
Profile HMMs and other grammatical models of sequences
Professor Richard Hughey, University of California at Santa Cruz (USA)
Since their introduction to biological sequence analysis a decade ago, hidden Markov models (HMMs) have become a standard tool for sequence alignment and remote homology detection. This tutorial examines the effective use of profile HMMs and provides a taste of other modeling techniques such as generalized HMMs and stochastic context-free grammars.
4:30 Break
4:50 Welcome
5:00 Keynote, "Technology Meets the Entertainment Industry: Building Virtual Humans for Immersive Training"
William Swartout, Institute for Creative Technologies, USC
6:00 PAPERS: Across Human Language Technologies
Guillaume Gravier, Gerasimos Potamianos, Chalapathy Neti (IBM Thomas J. Watson Research Center)
Jayadev Billa, Mohamed Noamany, Amit Srivastava, John Makhoul, Francis Kubala (BBN Technologies)
Konrad Scheffler, Steve Young (Cambridge University)
7:20 Reception ("light" hors d'oeuvres)
8:45 Adjourn
7:30am Breakfast (provided)
8:30am PAPERS: Speech Recognition
Teresa Kamm, Gerard G.L. Meyer (Johns Hopkins University)
Horacio Franco, Jing Zheng, John Butzberger, Federico Cesari, Michael Frandsen, Jim Arnold, Ramana Rao, Andreas Stolcke, Victor Abrash (SRI International)
Beth Logan, Pedro J. Moreno, Om Deshmukh (Compaq Computer Corporation)
Steven Greenberg, Hannah Carvey, Leah Hitchcock, Shuangyu Chang (International Computer Science Institute)
10:15 Break
10:45 PAPERS: Summarization
Donna Harman, Paul Over (National Institute of Standards and Technology)
Barry Schiffman, Ani Nenkova, Kathleen McKeown (Columbia University)
Chin-Yew Lin, Eduard Hovy (USC/ISI)
12:05pm Lunch (provided)
1:30pm SPECIAL SESSION ON LANGUAGE PROCESSING OF BIOLOGICAL DATA
1:30 Intro to special session
1:45 Invited Talk:
Statistical NLP approaches for annotating genes and gene clusters
Dr. Russ Altman,
President, International Society for Computational Biology,
Director, Biomedical Informatics Training Program
Stanford University Medical Center, Stanford University (USA)
Bioinformatics has been driven by a series of data explosions: sequence data, structure data, and functional data (most recently from microarray expression experiments). Another data explosion is the availability of text describing major biological results. Most important biomedical literature since 1966 has been indexed in Medline and is available on the web at PubMed. The literature is an important source of information to help make sense of the other data explosions. In this talk, I will review some of the challenges for natural language processing in biology, and discuss statistical techniques that my laboratory has used for adding knowledge derived from text to the tasks of 1) improving sequence homology searches, 2) assigning controlled terminologies to free text discussions, 3) evaluating the biological coherence of a group of genes, and 4) creating a lexicon of abbreviations.
2:50 PAPER: Biology and Natural Language Processing
3:15 Break (poster setup during this break)
3:45 PAPERS: Biology and Natural Language Processing
Udo Hahn, Stefan Schulz (Freiburg University)
M. Ganapathiraju, J. Klein-Seetharaman, R. Rosenfeld, J. Carbonell, R. Reddy (Carnegie Mellon University)
Yuka Tateisi, Tomoko Ohta, Jin-Dong Kim, Hideki Mima, Jun'ichi Tsujii (CREST, JST)
3:45 Discussion
5:30 Boaster session for poster session
6:15 Poster session with reception ("heavy" hors d'oeuvres)
9:30 Adjourn
7:30am Breakfast (provided)
8:50am PAPERS: Text Understanding
Peter Clark, Lisabeth Duncan, Heather Holmback, Tom Jenkins, John Thompson (Boeing)
Lenhart Schubert (University of Rochester)
9:45 Boaster session for demonstrations
10:30 Demonstrations ("Science Fair")
12:30pm Lunch (provided)
2:00 PAPERS: Information Retrieval & Text Tracking and Detection
R. Manmatha, H. Sever (University of Massachusetts, Amherst)
Steve Cronen-Townsend, W. Bruce Croft (University of Massachusetts)
Sreenivasa Sista, Richard Schwartz, Timothy R. Leek, John Makhoul (BBN Technologies)
Victor Lavrenko, James Allan, Edward DeGuzman, Daniel La Flamme, Veera Pollard, Steven Thomas (Center for Intelligent Information Retrieval)
3:45 Break
4:15 PAPERS: Machine Translation and Multilingual Systems
Yaser Al-Onaizan, Kevin Knight (USC/ISI)
Tanja Schultz, Qin Jin, Kornel Laskowski, Alicia Tribble, Alex Waibel (Carnegie Mellon University)
Kishore Papineni, Salim Roukos, Todd Ward, John Henderson, Florence Reeder (IBM & MITRE)
George Doddington (NIST)
6:00 Banquet (provided)
8:00 Plenary Demonstration Session
9:30 Adjourn
7:30am Breakfast (provided)
8:30 Panel of Government Sponsors
Gary Strong, National Science Foundation
Charles Wayne, DARPA
John Prange, ARDA
James Bass, DARPA
(other speakers to be arranged)
9:45 Break
10:15 PAPERS: Across Human Language Technologies
Kadri Hacioglu, Wayne Ward (The Center for Spoken Language Research)
John Prager, Jennifer Chu-Carroll, Krzysztof Czuba (IBM T.J. Watson Research Ctr.)
Hiromitsu Nishizaki, Seiichi Nakagawa (Toyohashi University of Technology)
Fei Huang, Alex Waibel (Language Technology Institute, Carnegie Mellon University)
12:00noon Wrap-up session
12:30pm Conference Ends
© 2002 HLT 2002
All rights reserved.