Outliers are observations that do not follow the statistical distribution of the bulk of the data, and consequently may lead to erroneous results with respect to statistical analy...
The Boeing Commercial Airplanes Wing Responsibility Center (WRC) needed a way to communicate quickly and effectively between its various plant locations. An important requirement ...
Shannon L. Fowler, Anne-Marie J. Novack, Michael J...
This paper describes a method for asking statistical questions about a large text corpus. We exemplify the method by addressing the question, "What percentage of Federal Regi...
Micro-blogs are a challenging new source of information for data mining techniques. Twitter is a micro-blogging service built to discover what is happening at any moment in time, a...
Training a statistical machine translation starts with tokenizing a parallel corpus. Some languages such as Chinese do not incorporate spacing in their writing system, which creat...