LINGUIST List 11.1069

Wed May 10 2000

Support: Summer Internship/Text Analysis

Editor for this issue: Karen Milligan


  • Stephen Poteet, Large Scale Text Analysis

    Message 1: Large Scale Text Analysis

    Date: Wed, 10 May 2000 14:26:17 -0700 (PDT)
    From: Stephen Poteet
    Subject: Large Scale Text Analysis

    The Large Scale Text Analysis project in Boeing Phantom Works is looking for a summer intern to help with the following tasks in text mining applications:

    * Building web-based demonstration GUIs for text mining applications

    * Building text analysis tools using a Java-based part-of-speech tagger

    The ideal candidate should have at least a BS/BA degree in a related field and be currently enrolled in a graduate program with a special interest in text mining, information retrieval, or text analysis. The candidate should be proficient in C/C++, Java, UNIX (Solaris preferred), Windows NT, Perl/CGI, and JavaScript, and be familiar with information retrieval, text mining, text analysis, basic parsing, and introductory linear algebra. Experience in building web-based GUIs and familiarity with basic linguistic concepts (e.g. parts of speech, noun phrases), knowledge-based systems, and statistics would be a big plus.

    The Large Scale Text Analysis project has invented a unique technology called TRUST (Text Representation Using Subspace Transformation), which combines linear algebra, statistics, natural language processing, and high-performance computing to solve a wide range of text mining problems. We are looking for a candidate who would like to gain hands-on experience solving large-scale text mining problems in industry, and possibly contribute to new developments in the technology.

    Project manager: Anne Kao

    Group manager: Jim Hoard