Information Visualization

Papers

Links to papers point to non-PNNL sites, some of which require a subscription or charge a fee to access the full text.


2015

Cook, K. A., Grinstein, G., & Whiting, M. A. (2014). The VAST Challenge: History, Scope, and Outcomes: An Introduction to the Special Issue. Information Visualization, 13(4), 301-312.

Encarnacao, L. M., Chuang, Y.-Y., Stork, A., Kasik, D., Rhyne, T.-M., Avila, L., . . . Wong, P. C. (2015). Future Directions in Computer Graphics and Visualization: From CG&A's Editorial Board. IEEE Computer Graphics and Applications, 35(1), 20-32.

Endert, A., Hossain, S. H., Ramakrishnan, N., North, C., Fiaux, P., & Andrews, C. (2014). The Human is the Loop: New Directions for Visual Analytics. Journal of Intelligent Information Systems, 43(3), 411-435.

Gastelum, Z. N., & Henry, M. J. (2014). Lessons Learned from the Development of an Example Precision Information Environment for International Safeguards. PNNL-23962.

Kleese van Dam, K., LaMothe, R. R., Vishnu, A., Smith, W. P., Thomas, M., Sharma, P., . . . Elsethagen, T. O. (2015). Building the Analysis in Motion Infrastructure. PNNL-24340. Retrieved from http://www.pnnl.gov/main/publications/external/technical_reports/PNNL-24340.pdf

Miller, E. A., Robinson, S. M., Prinke, A. M., Anderson, K. K., Webster, J. B., McCall, J. D., & Seifert, C. E. (2015). Adaptively Reevaluated Bayesian Localization (ARBL): A Novel Technique for Radiological Source Localization. Nuclear Instruments and Methods in Physics Research. Section A, Accelerators, Spectrometers, Detectors and Associated Equipment, 784, 332-338.

Potel, M., & Wong, P. C. (2014). Visualizing Twenty Years of Applications. IEEE Computer Graphics and Applications, 34(6), 6-11.

Rohlman, D., Syron, L., Hobbie, K., Anderson, K., Scaffidi, C., Sudakin, D., . . . Kincl, L. (2015). A Community-Based Approach to Developing a Mobile Device for Measuring Ambient Air Exposure, Spatial Location and Respiratory Health. Environmental Justice.

Scholtz, J., Plaisant, C., Whiting, M. A., & Grinstein, G. (2014). Evaluation of Visual Analytics Environments: The Road to the Visual Analytics Science and Technology Challenge Evaluation Methodology. Information Visualization, 13(4), 326-335.

Wood, L. S., Daily, J. A., Henry, M. J., Palmer, B. J., Schuchardt, K. L., Dazlich, D. A., . . . Randall, D. (2015). A Global Climate Model Agent for High Spatial and Temporal Resolution Data. International Journal of High Performance Computing Applications, 29(1), 107-116.


2014

Visualizing 20 Years of Applications

Potel M, and PC Wong. 2014. "Visualizing 20 Years of Applications." IEEE Comput. Grap. Appl., 34(6), 6–11. doi:10.1109/mcg.2014.121

Abstract

This issue of IEEE Computer Graphics and Applications marks the 20th anniversary of the Applications department as a regular feature of the magazine. We thought it might be interesting to look back at the 20 years of Applications department articles to assess its evolution over that time. By aggregating all twenty years of articles and applying a little statistical and visual analytics, we’ve uncovered some interesting characteristics and trends we thought we’d share to mark this 20-year milestone.

Psychosocial Modeling of Insider Threat Risk Based on Behavioral and Word Use Analysis

Greitzer FL, LJ Kangas, CF Noonan, CR Brown, and TA Ferryman. 2014. "Psychosocial Modeling of Insider Threat Risk Based on Behavioral and Word Use Analysis." e-Service Journal 9(1):106-138. doi:10.2979/eservicej.9.1.106

Abstract

In many insider crimes, managers and other coworkers observed that the offenders had exhibited signs of stress, disgruntlement, or other issues, but no alarms were raised. Barriers to using such psychosocial indicators include the inability to recognize the signs and the failure to record the behaviors so that they can be assessed. A psychosocial model was developed to assess an employee’s behavior associated with an increased risk of insider abuse. The model is based on case studies and research literature on factors/correlates associated with precursor behavioral manifestations of individuals committing insider crimes. A complementary Personality Factor modeling approach was developed to derive relevant personality characteristics from word use. Several implementations of the psychosocial model were evaluated by comparing their agreement with judgments of human resources and management professionals; the personality factor modeling approach was examined using email samples. If implemented in an operational setting, these models should be part of a set of management tools for employee assessment to identify employees who pose a greater insider threat.

Visual Analytics for Power Grid Contingency Analysis

Wong PC, Z Huang, Y Chen, PS Mackey, and S Jin. 2014. "Visual Analytics for Power Grid Contingency Analysis." IEEE Computer Graphics and Applications 34(1):42-51. doi:10.1109/MCG.2014.17

Abstract

Contingency analysis is the process of employing different measures to model scenarios, analyze them, and then derive the best response to remove the threats. This application paper focuses on a class of contingency analysis problems found in the power grid management system. A power grid is a geographically distributed interconnected transmission network that transmits and delivers electricity from generators to end users. The power grid contingency analysis problem is increasingly important because of both the growing size of the underlying raw data that need to be analyzed and the urgency to deliver working solutions in an aggressive timeframe. Failure to do so may bring significant financial, economic, and security impacts to all parties involved and society at large. The paper presents a scalable visual analytics pipeline that transforms about 100 million contingency scenarios to a manageable size and form for grid operators to examine different scenarios and come up with preventive or mitigation strategies to address the problems in a predictive and timely manner. Great attention is given to the computational scalability, information scalability, visual scalability, and display scalability issues surrounding the data analytics pipeline. Most of the large-scale computation requirements of our work are conducted on a Cray XMT multi-threaded parallel computer. The paper demonstrates a number of examples using western North American power grid models and data.
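
The paper's pipeline itself is Cray-scale, but the screening idea behind contingency selection is easy to illustrate. Below is a toy sketch of our own construction (not the authors' code): it enumerates single-line outages on a small invented grid graph and flags any outage that islands the network. A real analysis would also run a power flow per scenario.

```python
# Toy N-1 contingency screen on an invented grid topology.
import networkx as nx

grid = nx.Graph()
grid.add_edges_from([("gen1", "busA"), ("gen2", "busB"), ("busA", "busB"),
                     ("busA", "busC"), ("busB", "busC"), ("busC", "load1")])

critical = []
for line in list(grid.edges):
    grid.remove_edge(*line)            # simulate the outage
    if not nx.is_connected(grid):      # outage islands part of the grid
        critical.append(line)
    grid.add_edge(*line)               # restore before the next scenario

print("Lines whose loss islands the network:", critical)
```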

The Human is the Loop: New Directions for Visual Analytics

Endert, A., Hossain, M. S., Ramakrishnan, N., North, C., Fiaux, P., & Andrews, C. (2014). The human is the loop: new directions for visual analytics. Journal of Intelligent Information Systems, 43(3), 411–435. doi:10.1007/s10844-014-0304-9

Abstract

Visual analytics is the science of marrying interactive visualizations and analytic algorithms to support exploratory knowledge discovery in large datasets. We argue for a shift from a ‘human in the loop’ philosophy for visual analytics to a ‘human is the loop’ viewpoint, where the focus is on recognizing analysts’ work processes, and seamlessly fitting analytics into that existing interactive process. We survey a range of projects that provide visual analytic support contextually in the sensemaking loop, and outline a research agenda along with future challenges.

A High-Performance Workflow System for Subsurface Simulation

Freedman VL, X Chen, SA Finsterle, MD Freshley, I Gorton, LJ Gosink, E Keating, C Lansing, WAM Moeglein, CJ Murray, GSH Pau, EA Porter, S Purohit, ML Rockhold, KL Schuchardt, C Sivaramakrishnan, VV Vesselinov, and SR Waichler. 2014. "A high-performance workflow system for subsurface simulation." Environmental Modelling & Software 55:176-189. doi:10.1016/j.envsoft.2014.01.030

Abstract

Subsurface modeling applications typically neglect uncertainty in the conceptual models, past or future scenarios, and attribute most or all uncertainty to errors in model parameters. In this contribution, uncertainty in technetium-99 transport in a heterogeneous, deep vadose zone is explored with respect to the conceptual model using a next generation user environment called Akuna. Akuna provides a range of tools to manage environmental modeling projects, from managing simulation data to visualizing results from high-performance computational simulators. Core toolsets accessible through the user interface include model setup, grid generation, parameter estimation, and uncertainty quantification. The BC Cribs site at Hanford in southeastern Washington State is used to demonstrate Akuna capabilities. At the BC Cribs site, conceptualization of the system is highly uncertain because only sparse information is available for the geologic conceptual model, the physical and chemical properties of the sediments, and the history of waste disposal operations. Using the Akuna toolset to perform an analysis of conservative solute transport, significant prediction uncertainty in simulated concentrations is demonstrated by conceptual model variation. This demonstrates that conceptual model uncertainty is an important consideration in sparse data environments such as BC Cribs. It is also demonstrated that Akuna and the underlying toolset provide an integrated modeling environment that streamlines model setup, parameter optimization, and uncertainty analyses for high-performance computing applications.

A Global Climate Model Agent for High Spatial and Temporal Resolution Data

Wood LS, JA Daily, MJ Henry, BJ Palmer, KL Schuchardt, DA Dazlich, RP Heikes, and D Randall. 2014. "A Global Climate Model Agent for High Spatial and Temporal Resolution Data." International Journal of High Performance Computing Applications.

Abstract

Fine cell granularity in modern climate models can produce terabytes of data in each snapshot, causing significant I/O overhead. To address this issue, a method of reducing the I/O latency of high-resolution climate models by identifying and selectively outputting regions of interest is presented. Working with a Global Cloud Resolving Model and running with up to 10240 processors on a Cray XE6, this method provides significant I/O bandwidth reduction depending on the frequency of writes and size of the region of interest. The implementation challenges of determining global parameters in a strictly core-localized model and properly formatting output files that only contain subsections of the global grid are addressed, as well as the overall bandwidth impact and benefits of the method. The gains in I/O throughput provided by this method allow dual output rates for high-resolution climate models: a low-frequency global snapshot as well as a high-frequency regional snapshot when events of particular interest occur.
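
As a rough illustration of the selective-output idea (not the paper's GCRM implementation), the sketch below writes only a hypothetical region of interest from one model field to a netCDF file, which is how high-frequency regional snapshots can stay small while global snapshots remain infrequent. The field, region indices, and variable names are invented; the netCDF4-python package is assumed.

```python
# Write only a sub-region of a global field to keep high-frequency output small.
import numpy as np
from netCDF4 import Dataset

field = np.random.rand(1024, 2048).astype("f4")   # stand-in for one model field
roi = (slice(400, 480), slice(900, 1020))         # invented region of interest

with Dataset("regional_snapshot.nc", "w") as nc:
    nc.createDimension("lat", field[roi].shape[0])
    nc.createDimension("lon", field[roi].shape[1])
    var = nc.createVariable("temperature", "f4", ("lat", "lon"))
    var[:] = field[roi]               # only a subsection of the global grid
    var.roi_origin = "lat index 400, lon index 900"  # where the patch sits globally
```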

The VAST Challenge: History, Scope, and Outcomes

Cook KA, MA Whiting, and G Grinstein. 2014. "The VAST Challenge: History, Scope, and Outcomes." Information Visualization 13(4):301-312.

Abstract

Visual analytics aims to facilitate human insight from complex data via a combination of visual representations, interaction techniques, and supporting algorithms. To create new tools and techniques that achieve this goal requires that researchers have an understanding of analytical questions to be addressed, data that illustrates the complexities and ambiguities found in realistic analytic settings, and methods for evaluating whether the plausible insights are gained through use of the new methods. However, researchers do not, generally speaking, have access to analysts who can articulate their problems or operational data that is used for analysis. To fill this gap, the Visual Analytics Science and Technology (VAST) Challenge has been held annually since 2006. The VAST Challenge provides an opportunity for researchers to experiment with realistic but not real problems, using realistic synthetic data with known events embedded. Since its inception, the VAST Challenge has evolved along with the visual analytics research community to pose more complex challenges, ranging from text analysis to video analysis to large scale network log analysis. The seven years of the VAST Challenge have seen advancements in research and development, education, evaluation, and in the challenge process itself. This special issue of Information Visualization highlights some of the noteworthy advancements in each of these areas. Some of these papers focus on important research questions related to the challenge itself, and other papers focus on innovative research that has been shaped by participation in the challenge. This paper describes the VAST Challenge process and benefits in detail. It also provides an introduction to and context for the remaining papers in the issue.

Pathways to Identity: Aiding Law Enforcement in Identification Tasks With Visual Analytics

Bruce JR, J Scholtz, H Duncan, L Emanuel, D Stanton-Fraser, S Creese, and OJ Love. 2014. "Pathways to Identity: Aiding Law Enforcement in Identification Tasks With Visual Analytics." Security Informatics, 3(1), 12. doi:10.1186/s13388-014-0012-6

Abstract

The nature of identity has changed dramatically in recent years, and has grown in complexity. Identities are defined in multiple domains: biological and psychological elements strongly contribute, but also biographical and cyber elements are necessary to complete the picture. Law enforcement is beginning to adjust to these changes, recognizing its importance in criminal justice. The SuperIdentity project seeks to aid law enforcement officials in their identification tasks through research of techniques for discovering identity traits, generation of statistical models of identity and analysis of identity traits through visualization. We present use cases compiled through user interviews in multiple fields, including law enforcement, as well as the modeling and visualization tools design to aid in those use cases.

Semantic Interaction for Visual Analytics: Toward Coupling Cognition and Computation

Endert A. 2014. "Semantic Interaction for Visual Analytics: Toward Coupling Cognition and Computation." IEEE Computer Graphics and Applications 34(4):8-15. doi:10.1109/MCG.2014.73

Abstract

The dissertation discussed in this article was written in the midst of an era of digitization. The world is becoming increasingly instrumented with sensors, monitoring, and other methods for generating data describing social, physical, and natural phenomena. Thus, data exist with the potential of being analyzed to uncover, or discover, the phenomena from which it was created. However, as the analytic models leveraged to analyze these data continue to increase in complexity and computational capability, how can visualizations and user interaction methodologies adapt and evolve to continue to foster discovery and sensemaking?

Model for Aggregated Water Heater Load Using Dynamic Bayesian Networks

Vlachopoulou, M., Chin, G., Fuller, J. C., Lu, S., & Kalsi, K. (2012). Model for Aggregated Water Heater Load Using Dynamic Bayesian Networks.

Abstract

The transition to the new generation power grid, or “smart grid”, requires novel ways of using and analyzing data collected from the grid infrastructure. Fundamental functionalities that the smart grid needs, like demand response (DR), rely heavily on the ability of the energy providers and distributors to forecast the load behavior of appliances under different DR strategies. This paper presents a new model of aggregated water heater load, based on dynamic Bayesian networks (DBNs). The model has been validated against simulated data from an open source distribution simulation software (GridLAB-D). The results presented in this paper demonstrate that the DBN model accurately tracks the load profile curves of aggregated water heaters under different testing scenarios.
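
A full DBN is beyond a short example, but the flavor of aggregated-load forecasting can be sketched with a much simpler two-state Markov chain per water heater. Everything below (transition probabilities, rated draw, fleet size) is invented for illustration and is not the paper's model.

```python
# Simplified stand-in for the DBN: an off/on Markov chain tracked in expectation.
import numpy as np

P = np.array([[0.95, 0.05],    # P(next state | current = off)
              [0.20, 0.80]])   # P(next state | current = on)
rated_kw = 4.5                 # hypothetical per-unit draw when heating
n_heaters = 10_000

state = np.array([0.9, 0.1])   # initial distribution over (off, on)
for t in range(24):            # 24 hypothetical time steps
    load_kw = n_heaters * state[1] * rated_kw
    print(f"t={t:2d}  expected aggregate load = {load_kw:8.0f} kW")
    state = state @ P          # one time-slice update of the chain
```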

Accessorizing Building Science – A Web Platform to Support Multiple Market Transformation Programs

Madison MC, CA Antonopoulos, ST Dowson, TL Franklin, LC Carlsen, and MC Baechler.  2014.  "Accessorizing Building Science – A Web Platform to Support Multiple Market Transformation Programs."  In 2014 ACEEE Summer Study on Energy Efficiency in Buildings. American Council for an Energy-Efficient Economy. PNNL-SA-101292.

Abstract

As demand for improved energy efficiency in homes increases, builders need easily accessible information on building science measures, installation information, energy codes, and technical requirements for labeling programs. The Building America Solution Center (BASC) is a U.S. Department of Energy website containing hundreds of expert guides designed to help residential builders and other stakeholders access energy efficiency measures for new and existing homes. Users can package measures with other media, such as images and architectural drawings, to customize and archive content. BASC content provides technical support to market transformation programs such as ENERGY STAR. This approach has also been adapted for the Better Buildings Residential Program. BASC uses Drupal, an open source content management platform, to combine a variety of media in an interactive manner to make information easily accessible. Developers designed a unique taxonomy to organize and manage content. That taxonomy was translated into web-based modules that allow users to rapidly traverse structured content with related topics and media. This paper presents information on the current design of BASC and the underlying technology used to manage the content. In this paper, we explore features such as “Field Kits” that allow users to bundle and save content for quick access, along with the ability to export PDF versions of content. Finally, we will discuss development of mobile applications and a visualization tool for interacting with the Building Science Publications tool that allows the user to dynamically search the entire Building America Library.

User-Centered Design Guidelines for Collaborative Software for Intelligence Analysis

Scholtz J, and A Endert.  2014.  "User-Centered Design Guidelines for Collaborative Software for Intelligence Analysis."  In The 2014 International Conference on Collaboration Technologies and Systems (CTS 2014).

Abstract

In this position paper we discuss the necessity of using User-Centered Design (UCD) methods in order to design collaborative software for the intelligence community. We discuss a number of studies of collaboration in the intelligence community and use this information to provide some guidelines for collaboration software.

Finding Waldo: Learning about Users from their Interactions

Brown ET, Ottley A, Zhao H, Lin Q, Souvenir R, Endert A, and Chang R. "Finding Waldo: Learning about Users from their Interactions." IEEE Transactions on Visualization and Computer Graphics (TVCG), 2014.

Abstract

Visual analytics is inherently a collaboration between human and computer. However, in current visual analytics systems, the computer has limited means of knowing about its users and their analysis processes. While existing research has shown that a user’s interactions with a system reflect a large amount of the user’s reasoning process, there has been limited advancement in developing automated, real-time techniques that mine interactions to learn about the user. In this paper, we demonstrate that we can accurately predict a user’s task performance and infer some user personality traits by using machine learning techniques to analyze interaction data. Specifically, we conduct an experiment in which participants perform a visual search task, and apply well-known machine learning algorithms to three encodings of the users’ interaction data. We achieve, depending on algorithm and encoding, between 62% and 83% accuracy at predicting whether each user will be fast or slow at completing the task. Beyond predicting performance, we demonstrate that using the same techniques, we can infer aspects of the user’s personality factors, including locus of control, extraversion, and neuroticism. Further analyses show that strong results can be attained with limited observation time: in one case 95% of the final accuracy is gained after a quarter of the average task completion time. Overall, our findings show that interactions can provide information to the computer about its human collaborator, and establish a foundation for realizing mixed-initiative visual analytics systems.
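
The general setup — featurize interaction logs, then train an off-the-shelf classifier to predict fast versus slow task completion — can be sketched as follows. The features, data, and classifier choice here are ours for illustration, not the paper's encodings or results.

```python
# Hedged sketch: predict task speed from interaction-derived features.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical per-user features: clicks/min, mean pause (s), pans, zooms.
X = rng.normal(size=(60, 4))
y = rng.integers(0, 2, size=60)        # 1 = fast completion, 0 = slow

scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```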

Discrete Mathematical Approaches to Graph-Based Traffic Analysis.

Joslyn CA, WE Cowley, EA Hogan, and BK Olsen.  2014.  "Discrete Mathematical Approaches to Graph-Based Traffic Analysis."  In 2014 International Workshop on Engineering Cyber Security and Resilience (ECSaR’14).

Abstract

Modern cyber defense and analytics require general, formal models of cyber systems. Multi-scale network models are prime candidates for such formalisms, using discrete mathematical methods based in hierarchically-structured directed multigraphs which also include rich sets of labels. An exemplar of an application of such an approach is traffic analysis, that is, observing and analyzing connections between clients, servers, hosts, and actors within IP networks, over time, to identify characteristic or suspicious patterns. Towards that end, NetFlow (or more generically, IPFLOW) data are available from routers and servers which summarize coherent groups of IP packets flowing through the network. In this paper, we consider traffic analysis of NetFlow using both basic graph statistics and two new mathematical measures involving labeled degree distributions and time interval overlap measures. We do all of this over the VAST test data set of 96M synthetic NetFlow graph edges, against which we can identify characteristic patterns of simulated ground-truth network attacks.
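
One of the named measures, a labeled degree distribution, is straightforward to compute from NetFlow-like edge records. The sketch below uses a handful of invented flows and treats the destination port as the edge label; it is an illustration of the concept, not the paper's formalism.

```python
# Labeled out-degree distribution over toy NetFlow-style records.
from collections import Counter, defaultdict

flows = [("10.0.0.1", "10.0.0.9", 80), ("10.0.0.1", "10.0.0.7", 80),
         ("10.0.0.2", "10.0.0.9", 22), ("10.0.0.1", "10.0.0.9", 443)]

labeled_neighbors = defaultdict(set)
for src, dst, port in flows:
    labeled_neighbors[(src, port)].add(dst)   # distinct out-neighbors per (node, label)

degree_dist = Counter(len(v) for v in labeled_neighbors.values())
print(dict(degree_dist))   # {labeled out-degree: count of (node, label) pairs}
```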

Examining the Role and Research Challenges of Social Media as a Tool for Nonproliferation and Arms Control Treaty Verification

Henry MJ, NO Cramer, JM Benz, ZN Gastelum, SJ Kreyling, and CL West.  2014.  "Examining the Role and Research Challenges of Social Media as a Tool for Nonproliferation and Arms Control Treaty Verification."  In INMM Information Analysis Technologies, Techniques and Methods for Safeguards, Nonproliferation and Arms Control Verification Conference.  

Abstract

The research described in this paper investigates the utility of applying social media signatures as potential arms control and nonproliferation treaty verification tools and technologies, as determined through a series of case studies. The treaty relevant events that these case studies touch upon include detection of undeclared facilities or activities, determination of unknown events recorded by the International Monitoring System (IMS), and the global media response to the occurrence of an Indian missile launch. The case studies examine how social media can be used to fill an information gap and provide additional confidence to a verification activity. The case studies represent, either directly or through a proxy, instances where social media information may be available that could potentially augment the evaluation of an event.


2013

Typograph: Multiscale spatial exploration of text documents

Endert, A., Burtner, R., Cramer, N., Perko, R., Hampton, S., & Cook, K. (2013, October). Typograph: Multiscale spatial exploration of text documents. In Big Data, 2013 IEEE International Conference on (pp. 17-24). IEEE.

Abstract

Visualizing large document collections using a spatial layout of terms can enable quick overviews of information. These visual metaphors (e.g., word clouds, tag clouds, etc.) traditionally show a series of terms organized by space-filling algorithms. However, often lacking in these views is the ability to interactively explore the information to gain more detail, and the location and rendering of the terms are often not based on mathematical models that maintain relative distances from other information based on similarity metrics. In this paper, we present Typograph, a multi-scale spatial exploration visualization for large document collections. Building on term-based visualization methods, Typograph enables multiple levels of detail (terms, phrases, snippets, and full documents) within a single spatialization. Further, each piece of information is placed based on its relative similarity to other information to create the “near = similar” geographic metaphor. This paper discusses the design principles and functionality of Typograph and presents a use case analyzing Wikipedia to demonstrate usage.
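
The "near = similar" placement idea can be approximated with standard tools: embed TF-IDF vectors in 2-D with multidimensional scaling so similar items land near each other. This is a hedged sketch of the metaphor, not Typograph's actual layout algorithm; the documents are invented.

```python
# Approximate a "near = similar" 2-D layout with MDS over cosine distances.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import cosine_distances

docs = ["power grid contingency analysis", "grid operators analyze contingencies",
        "keyword extraction from documents", "extracting keywords from text"]

dist = cosine_distances(TfidfVectorizer().fit_transform(docs))
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dist)
for doc, (x, y) in zip(docs, coords):
    print(f"({x:+.2f}, {y:+.2f})  {doc}")   # similar docs get nearby coordinates
```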

Interactive Visual Comparison of Multimedia Data through Type-specific Views

Burtner, R., Bohn, S., & Payne, D. (2013, February). Interactive Visual Comparison of Multimedia Data through Type-specific Views. In IS&T/SPIE Electronic Imaging (pp. 86540M-86540M). International Society for Optics and Photonics.

Abstract

Analysts who work with collections of multimedia to perform information foraging understand how difficult it is to connect information across diverse sets of mixed media. The wealth of information from blogs, social media, and news sites often can provide actionable intelligence; however, many of the tools used on these sources of content are not capable of multimedia analysis because they only analyze a single media type. As such, analysts are taxed to keep a mental model of the relationships among each of the media types when generating the broader content picture. To address this need, we have developed Canopy, a novel visual analytic tool for analyzing multimedia. Canopy provides insight into the multimedia data relationships by exploiting the linkages found in text, images, and video co-occurring in the same document and across the collection. Canopy connects derived and explicit linkages and relationships through multiple connected visualizations to aid analysts in quickly summarizing, searching, and browsing collected information to explore relationships and align content. In this paper, we will discuss the features and capabilities of the Canopy system and walk through a scenario illustrating how this system might be used in an operational environment.

MultiFacet: A Faceted Interface for Browsing Large Multimedia Collections

Henry, Michael J; Hampton, Shawn; Endert, Alex; Roberts, Ian; Payne, Deborah. MultiFacet: A Faceted Interface for Browsing Large Multimedia Collections. International Symposium on Multimedia, 2013

Abstract

Faceted browsing is a common technique for exploring collections where the data can be grouped into a number of pre-defined categories, most often generated from textual metadata. Historically, faceted browsing has been applied to a single data type such as text or image data. However, typical collections contain multiple data types, such as information from web pages that contain text, images, and video. Additionally, when browsing a collection of images and video, facets are often created based on the metadata which may be incomplete, inaccurate, or missing altogether instead of the actual visual content contained within those images and video. In this work we address these limitations by presenting MultiFacet, a faceted browsing interface that supports multiple data types. MultiFacet constructs facets for images and video in a collection from the visual content using computer vision techniques. These visual facets can then be browsed in conjunction with text facets within a single interface to reveal relationships and phenomena within multimedia collections. Additionally, we present a use case based on real-world data, demonstrating the utility of this approach towards browsing a large multimedia data collection.

Affinity+: Semi-Structured Brainstorming on Large Displays

Burtner, E. R., May, R. A., Scarberry, R. E., LaMothe, R. R., & Endert, A. (2013). Affinity+: Semi-Structured Brainstorming on Large Displays. (No. PNNL-SA-93014). Pacific Northwest National Laboratory (PNNL), Richland, WA (US).

Abstract

Affinity diagramming is a powerful method for encouraging and capturing lateral thinking in a group environment. The Affinity+ concept was designed to improve the collaborative brainstorm process through the use of large display surfaces in conjunction with mobile devices like smart phones and tablets. The system works by capturing the ideas digitally and allowing users to sort and group them manually on a large touch screen. Additionally, Affinity+ incorporates theme detection, topic clustering, and other processing algorithms that help bring structured analytic techniques to the process without requiring explicit leadership roles and other overhead typically involved in these activities.


Digital Disease Detection Mobile App Development - Student Competition Results.

Henry MJ, CF Noonan, CD Corley, MA Antoniak, J Subbiah, YH Chou, M Bhalla, C Yang, Y Zhang, and L Qiao. 2013. "Digital Disease Detection Mobile App Development - Student Competition Results." Presented by Courtney D Corley at International Conference on Digital Disease Detection, San Francisco, CA on September 19, 2013. PNNL-SA-98459.

Abstract

Mobile devices are an enabling technology providing access to health care information and resources, safety and security information, social media, and other interactive technologies. These technologies are ideal for addressing the needs of digital disease detection. To explore solutions to biosurveillance challenges, Pacific Northwest National Laboratory (PNNL) has partnered with the United States Government Biosurveillance Ecosystem program to host a mobile application contest featuring teams of student researchers interested in addressing these problems. Each team was carefully selected to bring together students interested in the design and development of socially beneficial mobile applications that could be used to meet the needs of biosurveillance practice. The students have been given broad flexibility in terms of the types of applications they can develop, and have been given access to scientists and researchers at PNNL who have domain-specific knowledge to guide them. The aim of the intern competition is to provide support for innovative research in biosurveillance, as well as to motivate and inspire the next generation of researchers who will be at the forefront of investigating these problems. One team has focused on the Android platform and is developing a food safety app, FoodHound. FoodHound combines recall information, outbreak alerts, and restaurant inspection data to help users make informed food choices. Users can report suspected cases of food poisoning and get information about the risks associated with each food. The user interface includes a newsfeed, sharing to social networks, and a modern design. The other team is focusing on the iOS platform and is developing a multi-national influenza surveillance and disease prevention app targeted at teenagers, FL•U. FL•U is user-driven and inspired by the concept of Tamagotchi: it allows users to create one or more customized avatars. The avatars show various symptoms and react according to user interaction. For example, if a user reports a fever of 101 degrees F, the avatar’s face turns red. The app is targeted towards teenagers in the US and China through bilingual interfaces. Other features include vaccination notification, an accurate flu-case map, and an individual health index record displayed intuitively through an infographic. This talk will present the findings of the student teams: their applications, the benefit to biosurveillance research, and domains where their applications can provide real benefit to mobile device-enabled populations.

FoodFeed: A Food-Safety Android Application.

Antoniak MA, M Bhalla, YH Chou, J Subbiah, MJ Henry, CF Noonan, and CD Corley. 2013. "FoodFeed: A Food-Safety Android Application." Presented by Maria Antoniak, Mohit Bhalla, Lily Chou, Janani Subbiah, Court Corley at International Society for Disease Surveillance, New Orleans, LA on December 12, 2013. PNNL-SA-98224.

Abstract

FoodFeed, a mobile food safety app for Android devices, was developed as a part of a biosurveillance research project at Pacific Northwest National Laboratory. It uniquely combines food recalls, foodborne disease outbreaks, and restaurant inspection data to help users make informed food choices and better understand the risks involved in consumption of certain foods. Users can report suspected cases of foodborne illness and get information about the risks associated with different food groups and various foodborne diseases. The user interface includes a news feed, sharing to social media and a modern design, all supported by a robust back end system.


2012

Semantic Features for Classifying Referring Search Terms

May CJ, MJ Henry, LR McGrath, EB Bell, EJ Marshall, and ML Gregory. 2012. "Semantic Features for Classifying Referring Search Terms." In Proceedings of Northwest Natural Language Processing Conference (NW-NLP 2012), May 11, 2012, Redmond, Washington. University of Washington, Seattle, WA.

Abstract

When an internet user clicks on a result in a search engine, a request is submitted to the destination web server that includes a referrer field containing the search terms given by the user. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors’ needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying HTTP requests’ countries of origin. A system that can accurately predict the country of origin from query text may be a valuable complement to IP lookup methods which are susceptible to the obfuscation of dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.
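
A minimal version of this classification setup can be sketched with a standard text-classification pipeline. Note that the sketch below uses only surface character n-grams, not the semantic features the paper proposes, and the training queries and labels are invented.

```python
# Toy country-of-origin classifier over referring search terms.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

queries = ["cheap flights london", "pacific northwest lab",
           "visa requirements uk", "seattle weather forecast"]
countries = ["GB", "US", "GB", "US"]   # hypothetical labels

clf = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
                    LogisticRegression(max_iter=1000))
clf.fit(queries, countries)
print(clf.predict(["london tube map"]))   # expect a GB-leaning prediction
```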

Top Ten Challenges in Extreme-Scale Visual Analytics

Wong PC, HW Shen, CR Johnson, C Chen, and R Ross. 2012. "Top Ten Challenges in Extreme-Scale Visual Analytics." IEEE Computer Graphics and Applications 32(4):63-67. doi:10.1109/MCG.2012.87

Abstract

A team of scientists and researchers discusses the top 10 challenges in extreme-scale visual analytics (VA). The discussion covers applying VA technologies to both scientific and nonscientific data, evaluating the problems and challenges from both technical and social perspectives.

Extreme Scale Visual Analytics

Wong PC, HW Shen, and V Pascucci. 2012. "Extreme Scale Visual Analytics." IEEE Computer Graphics and Applications 32(4):23-25. doi:10.1109/MCG.2012.73

Abstract

Extreme-scale visual analytics (VA) is about applying VA to extreme-scale data. The articles in this special issue examine advances related to extreme-scale VA problems, their analytical and computational challenges, and their real-world applications.

Scalable Visual Analytics for Power Grid Contingency Analysis

Wong PC, Z Huang, Y Chen, PS Mackey, and S Jin. 2012. "Scalable Visual Analytics for Power Grid Contingency Analysis." PNNL-SA-88323, Pacific Northwest National Laboratory, Richland, WA.

Speech information retrieval: a review

Hafen RP, and MJ Henry. 2012. "Speech information retrieval: a review." Multimedia Systems 18(6):499-518. doi:10.1007/s00530-012-0266-0

Abstract

Speech is an information-rich component of multimedia. Information can be extracted from a speech signal in a number of different ways, and thus there are several well-established speech signal analysis research fields. These fields include speech recognition, speaker recognition, event detection, and fingerprinting. The information that can be extracted from tools and methods developed in these fields can greatly enhance multimedia systems. In this paper, we present the current state of research in each of the major speech analysis fields. The goal is to introduce enough background for someone new in the field to quickly gain high-level understanding and to provide direction for further study.

Why the CHI Community Should be Involved in Standards: Stories from Three CHI Participants

Lund A, J Scholtz, and N Bevan. 2012. "Why the CHI Community Should be Involved in Standards: Stories from Three CHI Participants." Interactions 19(1):70-74. doi:10.1145/2065327.2065341

Abstract

Public policy increasingly plays a role in influencing the work that we do as HCI researchers, interaction designers, and practitioners. “Public policy” is a broad term that includes both government policy and policy within non-governmental organizations, such as standards bodies. The Interacting with Public Policy forum focuses on topics at the intersection of human-computer interaction and public policy.

A Space-Filling Visualization Technique for Multivariate Small World Graphs

Wong PC, HP Foote, PS Mackey, G Chin, Jr, Z Huang, and JJ Thomas. 2012. "A Space-Filling Visualization Technique for Multivariate Small World Graphs." IEEE Transactions on Visualization and Computer Graphics 18(5):797-809. doi:10.1109/TVCG.2011.99

Abstract

We introduce an information visualization technique, known as GreenCurve, for large multivariate sparse graphs that exhibit small-world properties. Our fractal-based design approach uses spatial cues to approximate the node connections and thus eliminates the links between the nodes in the visualization. The paper describes a robust algorithm that orders the neighboring nodes of a large sparse graph by solving for the Fiedler vector of its graph Laplacian, and then folds the graph nodes into a space-filling fractal curve based on the Fiedler vector. The result is a highly compact visualization that gives a succinct overview of the graph with guaranteed visibility of every graph node. GreenCurve is designed with the power grid infrastructure in mind. It is intended for use in conjunction with other visualization techniques to support electric power grid operations. The research and development of GreenCurve was conducted in collaboration with domain experts who understand the challenges and possibilities intrinsic to the power grid infrastructure. The paper reports a case study on applying GreenCurve to a power grid problem and presents a usability study to evaluate the design claims that we set forth.
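
The Fiedler-vector ordering step generalizes beyond GreenCurve and is easy to sketch. The following is an illustrative reconstruction on a random small-world graph, not the paper's implementation; a dense eigensolver is used here because the example graph is tiny.

```python
# Order nodes of a small-world graph by the Fiedler vector of its Laplacian.
import networkx as nx
import numpy as np

G = nx.connected_watts_strogatz_graph(200, 4, 0.1, seed=1)  # small-world stand-in
L = nx.laplacian_matrix(G).toarray().astype(float)

# Eigenvectors sorted by ascending eigenvalue; column 1 (second smallest
# eigenvalue) is the Fiedler vector.
vals, vecs = np.linalg.eigh(L)
fiedler_order = np.argsort(vecs[:, 1])
print(fiedler_order[:10])   # first few nodes of the 1-D layout order
```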

In Silico Identification Software (ISIS): A Machine Learning Approach to Tandem Mass Spectral Identification of Lipids

Kangas LJ, TO Metz, G Isaac, BT Schrom, B Ginovska-Pangovska, L Wang, L Tan, RR Lewis, and JH Miller. 2012. "In Silico Identification Software (ISIS): A Machine Learning Approach to Tandem Mass Spectral Identification of Lipids." Bioinformatics 28(13):1705-1713. doi:10.1093/bioinformatics/bts194

Abstract

MOTIVATION: Liquid chromatography-mass spectrometry-based metabolomics has gained importance in the life sciences, yet it is not supported by software tools for high throughput identification of metabolites based on their fragmentation spectra. An algorithm (ISIS: in silico identification software) and its implementation are presented and show great promise in generating in silico spectra of lipids for the purpose of structural identification. Instead of using chemical reaction rate equations or rules-based fragmentation libraries, the algorithm uses machine learning to find accurate bond cleavage rates in a mass spectrometer employing collision-induced dissociation tandem mass spectrometry. RESULTS: A preliminary test of the algorithm with 45 lipids from a subset of lipid classes shows both high sensitivity and specificity.

Conversations About Vaccines In Social Media

Corley CD, SD Brown, SJ Rose, and MG Myers. 2012. "Conversations About Vaccines In Social Media." PNWD-SA-9798, Battelle—Pacific Northwest Division, Richland, WA.

Abstract

Concerns about vaccine safety arise quickly—whether supported by evidence or not—leading some to withhold vaccines. Social media permit informed and misinformed persons to rapidly disseminate their opinions. In order to effectively address emerging vaccine safety concerns, it will be necessary to detect them quickly. To do this, we developed tools to identify, quantify and analyze emerging vaccine safety issues in social media.

Coherent Image Layout using an Adaptive Visual Vocabulary

Dillard SE, MJ Henry, SJ Bohn, and LJ Gosink. 2012. "Coherent Image Layout using an Adaptive Visual Vocabulary." In IS&T/SPIE Electronic Imaging: Proc. SPIE 8661, Image Processing: Machine Vision Applications VI, 86610Q (March 6, 2013). doi:10.1117/12.2004733. PNNL-SA-92482, Pacific Northwest National Laboratory, Richland, WA.

Abstract

When querying a huge image database containing millions of images, the result of the query may still contain many thousands of images that need to be presented to the user. We consider the problem of arranging such a large set of images into a visually coherent layout, one that places similar images next to each other. Image similarity is determined using a bag-of-features model, and the layout is constructed from a hierarchical clustering of the image set by mapping an in-order traversal of the hierarchy tree into a space-filling curve. This layout method provides strong locality guarantees so we are able to quantitatively evaluate performance using standard image retrieval benchmarks. Performance of the bag-of-features method is best when the vocabulary is learned on the image set being clustered. Because learning a large, discriminative vocabulary is a computationally demanding task, we present a novel method for efficiently adapting a generic visual vocabulary to a particular dataset. We evaluate our clustering and vocabulary adaptation methods on a variety of image datasets and show that adapting a generic vocabulary to a particular set of images improves performance on both hierarchical clustering and image retrieval tasks.
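
The layout construction — hierarchical clustering, leaf ordering, then a space-filling placement — can be sketched with standard SciPy tools. The sketch below substitutes a simple serpentine path for the paper's space-filling curve and uses random stand-in bag-of-features histograms; it illustrates the idea rather than reproducing the method.

```python
# Cluster feature histograms, take the dendrogram leaf order, place on a
# serpentine path so adjacent (similar) images land in adjacent grid cells.
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

rng = np.random.default_rng(0)
hists = rng.random((64, 128))                 # 64 images x 128 visual words

order = leaves_list(linkage(hists, method="average", metric="cosine"))

side = 8                                      # 8 x 8 grid for 64 images
layout = {}
for i, img in enumerate(order):
    row, col = divmod(i, side)
    if row % 2 == 1:
        col = side - 1 - col                  # reverse odd rows: serpentine path
    layout[img] = (row, col)
print(layout[order[0]], layout[order[1]])     # adjacent leaves stay adjacent
```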

2011

Graph Analytics—Lessons Learned and Challenges Ahead

Pak Chung Wong, Chaomei Chen, Carsten Gorg, Ben Shneiderman, John Stasko, Jim Thomas, Graph Analytics—Lessons Learned and Challenges Ahead, IEEE Computer Graphics and Applications, vol. 31, no. 5, pp. 18-29, Sep./Oct. 2011, doi:10.1109/MCG.2011.72

Abstract

Graph analytics is one of the most influential and important R&D topics in the visual analytics community. Researchers with diverse backgrounds from information visualization, human-computer interaction, computer graphics, graph drawing, and data mining have pursued graph analytics from scientific, technical, and social approaches. These studies have addressed both distinct and common challenges. Past successes and mistakes can provide valuable lessons for revising the research agenda. In this article, six researchers from four academic and research institutes identify graph analytics' fundamental challenges and present both insightful lessons learned from their experience and good practices in graph analytics research. The goal is to critically assess those lessons and shed light on how they can stimulate research and draw attention to grand challenges for graph analytics. The article also establishes principles that could lead to measurable standards and criteria for research.

HPC 2011 - A Highly Parallel Implementation of K-Means for Multithreaded Architecture

Mackey PS, JT Feo, PC Wong, and Y Chen. 2011. "A Highly Parallel Implementation of K-Means for Multithreaded Architecture." In 19th High Performance Computing Symposium (HPC 2011): SCS Spring Simulation Multiconference (SpringSim 2011), April 3-7, 2011, Boston, MA. ACM, New York, NY.

Abstract

We present a parallel implementation of the popular k-means clustering algorithm for massively multithreaded computer systems, as well as a parallelized version of the KKZ seed selection algorithm. We demonstrate that as system size increases, sequential seed selection can become a bottleneck. We also present an early attempt at parallelizing k-means that highlights critical performance issues when programming massively multithreaded systems. For our case studies, we used data collected from electric power simulations and run on the Cray XMT.
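
KKZ seeding itself is simple to state: the first center is the point with the largest norm, and each subsequent center is the point farthest from its nearest chosen center. A sequential NumPy sketch follows; the paper's contribution is parallelizing this step and k-means on the Cray XMT, which this sketch does not attempt.

```python
# Sequential KKZ seed selection for k-means.
import numpy as np

def kkz_seeds(X, k):
    centers = [X[np.argmax(np.linalg.norm(X, axis=1))]]   # largest-norm point
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])   # farthest point from chosen seeds
    return np.array(centers)

X = np.random.default_rng(1).normal(size=(1000, 2))
print(kkz_seeds(X, 4))
```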

Collaborative Visualization: Definition, Challenges, and Research Agenda

Isenberg P, N Elmqvist, J Scholtz, D Cernea, KL Ma, and H Hagen. 2011. Collaborative Visualization: Definition, Challenges, and Research Agenda. Information Visualization. doi:10.1177/1473871611412817

Abstract

Collaborative visualization has emerged as a new research direction which offers the opportunity to reach new audiences and application areas for visualization tools and techniques. Technology now allows us to easily connect and collaborate with one another—in settings as diverse as over networked computers, across mobile devices, or using shared displays such as interactive walls and tabletop surfaces. Any of these collaborative settings carries a set of challenges and opportunities for visualization research. Digital information is already regularly accessed by multiple people together in order to share information, to view it together, to analyze it, or to form decisions. However, research on how to best support collaboration with and around visualizations is still in its infancy and has so far focused only on a small subset of possible application scenarios. The purpose of this article is (1) to provide a clear scope, definition, and overview of the evolving field of collaborative visualization, (2) to help pinpoint the unique focus of collaborative visualization with its specific aspects, challenges, and requirements within the intersection of general computer-supported collaborative work (CSCW) and visualization research, and (3) to draw attention to important future research questions to be addressed by the community. Thus, the goal of the paper is to discuss a research agenda for future work on collaborative visualization, including our vision for how to meet the grand challenge and to urge for a new generation of visualization tools that were designed with collaboration in mind from their very inception.

Report on the Dagstuhl Seminar on Visualization and Monitoring of Network Traffic

Keim D, A Pras, J Schonwalder, PC Wong, and F Mansmann. 2011. Report on the Dagstuhl Seminar on Visualization and Monitoring of Network Traffic. Journal of Network and Systems Management 18(2):232-236. doi:10.1007/s10922-010-9161-1

Abstract

The Dagstuhl Seminar on Visualization and Monitoring of Network Traffic [1] took place May 17-20, 2009 in Dagstuhl, Germany. Dagstuhl seminars promote personal interaction and open discussion of results as well as new ideas. Unlike at most conferences, the focus is not solely on the presentation of established results but to equal parts on results, ideas, sketches, and open problems. The aim of this particular seminar was to bring together experts from the information visualization community and the networking community in order to discuss the state of the art of monitoring and visualization of network traffic. People from the different research communities involved jointly organized the seminar. The co-chairs of the seminar from the networking community were Aiko Pras (University of Twente) and Jürgen Schönwälder (Jacobs University Bremen). The co-chairs from the visualization community were Daniel A. Keim (University of Konstanz) and Pak Chung Wong (Pacific Northwest National Lab). Florian Mansmann (University of Konstanz) helped with producing this report. The seminar was organized and supported by Schloss Dagstuhl and the EC IST-EMANICS Network of Excellence.

Crowdsourcing, citizen sensing and Sensor Web technologies for public and environmental health surveillance and crisis management: trends, OGC standards and application examples

Kamel Boulos M, B Resch, DN Crowley, JG Breslin, G Sohn, ER Burtner, WA Pike, E Jeziersk, and KY Slayer Chuang. 2011. "Crowdsourcing, citizen sensing and Sensor Web technologies for public and environmental health surveillance and crisis management: trends, OGC standards and application examples." International Journal of Health Geographics 10:Article No. 67. doi:10.1186/1476-072X-10-67

Abstract

'Wikification of GIS by the masses' is a phrase-term first coined by Kamel Boulos in 2005, two years earlier than Goodchild's term 'Volunteered Geographic Information'. Six years later (2005-2011), OpenStreetMap and Google Earth (GE) are now full-fledged, crowdsourced 'Wikipedias of the Earth' par excellence, with millions of users contributing their own layers to GE, attaching photos, videos, notes and even 3-D (three dimensional) models to locations in GE. From using Twitter in participatory sensing and bicycle-mounted sensors in pervasive environmental sensing, to creating a 100,000-sensor geo-mashup using Semantic Web technology, to the 3-D visualisation of indoor and outdoor surveillance data in real-time and the development of next-generation, collaborative natural user interfaces that will power the spatially-enabled public health and emergency situation rooms of the future, where sensor data and citizen reports can be triaged and acted upon in real-time by distributed teams of professionals, this paper offers a comprehensive state-of-the-art review of the overlapping domains of the Sensor Web, citizen sensing and 'human-in-the-loop sensing' in the era of the Mobile and Social Web, and the roles these domains can play in environmental and public health surveillance and crisis/disaster informatics. We provide an in-depth review of the key issues and trends in these areas, the challenges faced when reasoning and making decisions with real-time crowdsourced data (such as issues of information overload, "noise", misinformation, bias and trust), the core technologies and Open Geospatial Consortium (OGC) standards involved (Sensor Web Enablement and Open GeoSMS), as well as a few outstanding project implementation examples from around the world.

Developing Guidelines for Assessing Visual Analytics Environments

Scholtz J. 2011. "Developing Guidelines for Assessing Visual Analytics Environments." Information Visualization 10(3):212-231. doi:10.1177/1473871611407399

Abstract

In this article, we develop guidelines for evaluating visual analytics environments based on a synthesis of reviews for the entries to the 2009 Visual Analytics Science and Technology (VAST) Symposium Challenge and from a user study with professional intelligence analysts. By analyzing the 2009 VAST Challenge reviews, we gained a better understanding of what is important to our reviewers, both visualization researchers and professional analysts. We also report on a small user study with professional analysts to determine the important factors that they use in evaluating visual analysis systems. We also looked at guidelines developed by researchers in various domains and synthesized the results from these three efforts into an initial set for use by others in the community. One challenge for future visual analytics systems is to help in the generation of reports. In our user study, we also worked with analysts to understand the criteria they used to evaluate the quality of analytic reports. We propose that this knowledge will be useful as researchers look at systems to automate some of the report generation. From these two efforts, we produced some initial guidelines for evaluating visual analytics environments and for the evaluation of analytic reports. It is important to understand that these guidelines are initial drafts and are limited in scope as the visual analytics systems we evaluated were used in specific tasks. We propose these guidelines as a starting point for the Visual Analytics Community.

Facets for Discovery and Exploration in Text Collections

Rose, S; Roberts, I; Cramer, N, "Facets for Discovery and Exploration in Text Collections", IEEE Workshop on Interactive Visual Text Analytics for Decision Making, 2011 IEEE VisWeek.

Abstract

Faceted classifications of text collections provide a useful means of partitioning documents into related groups; however, traditional approaches to faceting text collections rely on comprehensive analysis of the subject area or annotated general attributes. In this paper we show the application of basic principles for facet analysis to the development of computational methods for facet classification of text collections. Integration with a visual analytics system is described with summaries of user experiences.


2010

Automatic Keyword Extraction from Individual Documents

Rose SJ, DW Engel, NO Cramer, and WE Cowley. 2010. Automatic Keyword Extraction from Individual Documents. Chapter 1 in Text Mining: Application and Theory, vol. 1, ed. MW Berry, J Kogan, pp. 3-20. John Wiley & Sons, Chichester, United Kingdom.

Abstract

This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method's configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.
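
The method's core loop — split text into candidate phrases at stopwords and punctuation, then score phrases by their words' degree-to-frequency ratios — can be sketched in a few lines. The stopword list below is a toy stand-in for the corpus-specific lists the chapter evaluates.

```python
# Minimal keyword-extraction sketch in the spirit of the chapter's method.
import re
from collections import defaultdict

STOP = {"a", "an", "and", "as", "for", "from", "in", "of", "on", "the", "to"}

def extract_keywords(text):
    words = re.findall(r"[a-zA-Z]+", text.lower())
    phrases, current = [], []
    for w in words:                       # break candidates at stopwords
        if w in STOP:
            if current:
                phrases.append(current)
                current = []
        else:
            current.append(w)
    if current:
        phrases.append(current)

    freq, degree = defaultdict(int), defaultdict(int)
    for p in phrases:
        for w in p:
            freq[w] += 1
            degree[w] += len(p)           # word degree within its phrase
    score = lambda p: sum(degree[w] / freq[w] for w in p)
    return sorted((" ".join(p) for p in phrases),
                  key=lambda s: -score(s.split()))

print(extract_keywords("Compatibility of systems of linear constraints "
                       "over the set of natural numbers")[:3])
```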

Events and Trends in Text Streams

Engel DW, PD Whitney, and NO Cramer. 2010. Events and Trends in Text Streams. Chapter 9 in Text Mining: Application and Theory, vol. 1, ed. MW Berry, J Kogan. John Wiley & Sons, Chichester, United Kingdom.

Abstract

Text streams--collections of documents or messages that are generated and observed over time--are ubiquitous. Our research and development are targeted at developing algorithms to find and characterize changes in topic within text streams. To date, this research has demonstrated the ability to detect and describe 1) short duration, atypical events and 2) the emergence of longer-term shifts in topical content. This technology has been applied to predefined temporally ordered document collections but is also suitable for application to near-real-time textual data streams.

Real-Time Visualization of Network Behaviors for Situational Awareness

Best DM, SJ Bohn, DV Love, AS Wynne, and WA Pike. 2010. Real-Time Visualization of Network Behaviors for Situational Awareness. In Proceedings of the Seventh International Symposium on Visualization for Cyber Security, pp. 79-90. ACM, New York, NY. doi:10.1145/1850795.1850805

Abstract

Plentiful, complex, and dynamic data make understanding the state of an enterprise network difficult. Although visualization can help analysts understand baseline behaviors in network traffic and identify off-normal events, visual analysis systems often do not scale well to operational data volumes (in the hundreds of millions to billions of transactions per day) nor to analysis of emergent trends in real-time data. We present a system that combines multiple, complementary visualization techniques coupled with in-stream analytics, behavioral modeling of network actors, and a high-throughput processing platform called MeDICi. This system provides situational understanding of real-time network activity to help analysts take proactive response steps. We have developed these techniques using requirements gathered from the government users for which the tools are being developed. By linking multiple visualization tools to a streaming analytic pipeline, and designing each tool to support a particular kind of analysis (from high-level awareness to detailed investigation), analysts can understand the behavior of a network across multiple levels of abstraction.

Developing Qualitative Metrics for Visual Analytic Environments

Scholtz J. 2010. Developing Qualitative Metrics for Visual Analytic Environments. In BELIV '10: Beyond time and errors: novel evaluation methods for Information Visualization, A Workshop of the ACM CHI Conference, April 10-11, 2010, Atlanta, Georgia. Association for Computing Machinery, New York, NY.

Abstract

In this paper, we examine reviews for the entries to the 2009 Visual Analytics Science and Technology (VAST) Challenge. By analyzing these reviews we gained a better understanding of what is important to our reviewers, both visualization researchers and professional analysts. This is a bottom up approach to the development of heuristics to use in the evaluation of visual analytic environments. The meta-analysis and the results are presented in this paper.

Multimedia Analysis + Visual Analytics = Multimedia Analytics

Chinchor N, JJ Thomas, PC Wong, M Christel, and MW Ribarsky. 2010. Multimedia Analysis plus Visual Analytics = Multimedia Analytics. IEEE Computer Graphics and Applications 30(5):52-60. doi:10.1109/MCG.2010.92

Abstract

Multimedia analysis has focused on images, video, and to some extent audio and has made progress in single channels excluding text. Visual analytics has focused on the user interaction with data during the analytic process plus the fundamental mathematics and has continued to treat text as did its precursor, information visualization. The general problem we address in this tutorial is the combining of multimedia analysis and visual analytics to deal with multimedia information gathered from different sources, with different goals or objectives, and containing all media types and combinations in common usage.

High-Throughput Real-Time Network Flow Visualization

Best DM, DV Love, WA Pike, and SJ Bohn. 2010. High-Throughput Real-Time Network Flow Visualization. FloCon2010, New Orleans, LA. PNNL-SA-69233.

Abstract

This presentation and demonstration will introduce two interactive, high-throughput visual analysis tools, Traffic Circle and CLIQUE, and will discuss the analytic requirements of the U.S. government cyber security capabilities for which the tools were developed and are being deployed. Both tools take a time-based approach to visual analysis, with Traffic Circle displaying raw data and CLIQUE computing real-time behavioral models. Performance benchmarks will also be discussed; the tools are currently capable of ingesting and presenting data volumes on the order of hundreds of millions of flow records at once.

A Novel Application of Parallel Betweenness Centrality to Power Grid Contingency Analysis

Jin S, Z Huang, Y Chen, D Chavarria-Miranda, JT Feo, and PC Wong. 2010. A Novel Application of Parallel Betweenness Centrality to Power Grid Contingency Analysis. In IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2010), pp. 1-7. Institute of Electrical and Electronics Engineers, Piscataway, NJ. doi:10.1109/IPDPS.2010.5470400

Abstract

In Energy Management Systems, contingency analysis is commonly performed for identifying and mitigating potentially harmful power grid component failures. The exponentially increasing combinatorial number of failure modes imposes a significant computational burden for massive contingency analysis. It is critical to select a limited set of high-impact contingency cases within the constraints of computing power and time requirements to make real-time power system vulnerability assessment possible. In this paper, we present a novel application of parallel betweenness centrality to power grid contingency selection. We cross-validate the proposed method using the model and data of the western US power grid, and implement it on a Cray XMT system - a massively multithreaded architecture - leveraging its advantages for parallel execution of irregular algorithms, such as graph analysis. We achieve a speedup of 55 times (on 64 processors) compared against the single-processor version of the same code running on the Cray XMT. We also compare against an OpenMP-based version of the same code running on an HP Superdome shared-memory machine. The Cray XMT code shows better scalability and resource utilization, and shorter execution times for large-scale power grids. The proposed approach has been evaluated in PNNL's Electricity Infrastructure Operations Center (EIOC). It is expected to provide a quick and efficient solution to massive contingency selection problems, helping power grid operators identify and mitigate potential widespread cascading power grid failures in real time.
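As an illustration of the selection idea only (not the authors' parallel Cray XMT implementation), edge betweenness centrality ranks lines by the fraction of all-pairs shortest paths that cross them; highly ranked lines are natural contingency candidates. The toy topology and the use of networkx below are assumptions of this sketch.

    import networkx as nx

    # Toy grid topology; a real study would load the western-US grid model.
    G = nx.Graph()
    G.add_edges_from([("gen1", "busA"), ("busA", "busB"), ("busB", "busC"),
                      ("busA", "busC"), ("busC", "load1"), ("busB", "load2")])

    # Edge betweenness: fraction of all-pairs shortest paths using each line.
    eb = nx.edge_betweenness_centrality(G)

    # Keep the k highest-ranked lines as candidate contingency cases.
    k = 3
    for line in sorted(eb, key=eb.get, reverse=True)[:k]:
        print(line, round(eb[line], 3))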

A Multi-Phase Network Situational Awareness Cognitive Task Analysis

Erbacher R, DA Frincke, PC Wong, S Moody, and GA Fink. 2010. A Multi-Phase Network Situational Awareness Cognitive Task Analysis. Information Visualization 9(3):204-219.

Abstract

The goal of our project is to create a set of next-generation cyber situational-awareness capabilities with applications to other domains in the long term. The objective is to improve the decision-making process to enable decision makers to choose better actions. To this end, we put extensive effort into making certain that we had feedback from network analysts and managers and understood what their genuine needs were. This article discusses the cognitive task-analysis methodology that we followed to acquire feedback from the analysts. This article also provides the details we acquired from the analysts on their processes, goals, concerns, and the data and metadata that they analyze. Finally, we describe the generation of a novel task-flow diagram representing the activities of the target user base.

Cognitive Task Analysis of Network Analysts and Managers for Network Situational Awareness

Erbacher R, DA Frincke, PC Wong, S Moody, and GA Fink. 2010. Cognitive Task Analysis of Network Analysts and Managers for Network Situational Awareness. In Proceedings of the SPIE: Visualization and Data Analysis 2010, vol. 7530, ed. J Park, MC Hao, PC Wong and C Chen, p. Art No.: 75300H. SPIE, Bellingham, WA. doi:10.1117/12.845488

Abstract

The goal of the project was to create a set of next-generation cyber situational awareness capabilities with applications to other domains in the long term. The objective is to improve the decision-making process so that decision makers can choose better actions. To this end, we put extensive effort into ensuring we had feedback from network analysts and managers and understood what their needs truly were; consequently, that is the focus of this portion of the research. This paper discusses the methodology we followed to acquire this feedback from the analysts, namely a cognitive task analysis. Additionally, this paper provides the details we acquired from the analysts: their processes, goals, concerns, and the data and metadata they analyze. A final result we describe is the generation of a task-flow diagram.

GWVis: A Tool for Comparative Ground-Water Data Visualization

Best DM, and RR Lewis. 2010. "GWVis: A Tool for Comparative Ground-Water Data Visualization." Computers & Geosciences 36(11):1436-1442. doi:10.1016/j.cageo.2010.04.006

Abstract

The Ground-Water Visualization application (GWVis) presents ground-water data visually in order to educate the public on ground-water issues. It is also intended for presentations to government and other funding agencies. Current three-dimensional models of ground water are overly complex, while two-dimensional representations (i.e., on paper) are neither comprehensive nor engaging. At present, GWVis operates on water head elevation data over a given time span, together with a matching (fixed) underlying geography. Two elevation scenarios are compared with each other, typically a control data set (actual field data) and a simulation. Scenario comparison can be animated for the time span provided. We developed GWVis using the Python programming language, associated libraries, and pyOpenGL extension packages to improve performance and control attributes of the model (such as color, positioning, scale, and interpolation). GWVis bridges the gap between two-dimensional and dynamic three-dimensional research visualizations by providing an intuitive, interactive design that allows participants to view the model from different perspectives and to infer information about scenarios. By incorporating scientific data in an environment that can be easily understood, GWVis allows the information to be presented to a large audience base.

Climate Change Impacts on Residential and Commercial Loads in the Western U.S. Grid

Lu N, ZT Taylor, W Jiang, C Jin, J Correia, Jr, LYR Leung, and PC Wong. 2010. "Climate Change Impacts on Residential and Commercial Loads in the Western U.S. Grid." IEEE Transactions on Power Systems 25(1):480-488. doi: 10.1109/TPWRS.2009.2030387

Abstract

This paper presents a multidisciplinary modeling approach to quickly quantify climate change impacts on energy consumption, peak load, and load composition of residential and commercial buildings. This research focuses on addressing the impact of temperature changes on the building cooling load in ten major cities across the Western United States and Canada. Our results have shown that by the mid-century, building yearly energy consumption and peak load will increase in the Southwest. Moreover, the peak load months will spread out to not only the summer months but also spring and autumn months. The Pacific Northwest will experience more hot days in the summer months. The penetration levels of air-conditioning (a/c) systems in this region are likely to increase significantly over the years. As a result, some locations in the Pacific Northwest may be shifted from winter peaking to summer peaking. Overall, the Western U.S. grid may see more simultaneous peaks across the North and South in summer months. Increased cooling load will result in a significant increase in the motor load, which consumes more reactive power and requires stronger voltage support from the grid. This study suggests an increasing need for the industry to implement new technology to increase the efficiency of temperature-sensitive loads and apply proper protection and control to prevent possible adverse impacts of a/c motor loads.

Introduction: Special Issue of Selected Papers from Visualization and Data Analysis 2010

Wong PC, J Park, and M Hao. 2010. "Introduction: Special Issue of Selected Papers from Visualization and Data Analysis 2010." Information Visualization 9(3):165-166. doi:10.1057/ivs.2010.7

Abstract

The annual Visualization and Data Analysis (VDA) conference has grown rapidly since 1994 and has attracted participants throughout the world. The need for visualizing large amounts of information arises in all areas of science, technology, industry, business, and everyday life. New applications must adapt to the requirement of displaying ever-increasing amounts of information.

Jim Thomas: A Collection of Memories

Wong PC. 2010. "Jim Thomas: A Collection of Memories." Information Visualization 9(4):233-234. doi:10.1057/ivs.2010.13

Abstract

Jim Thomas, a guest editor and a long-time associate editor of Information Visualization, died in Richland, WA, on 6th August, 2010 due to complications from a brain tumor. His friends and colleagues from around the world have since expressed their sadness and paid tribute to a visionary scientist in multiple public forums. For those who didn't get the chance to know Jim, I share a collection of my own memories of Jim Thomas and memories from some of his colleagues.

Jim Thomas, 1946-2010

Stone M, D Kasik, M Bailey, A van Dam, J Dill, TM Rhyne, J Foley, LM Encarnacao, L Rosenblum, R Earnshaw, KL Ma, PC Wong, J Encarnacao, D Fellner, and B Urban. 2010. "Jim Thomas, 1946-2010." IEEE Computer Graphics and Applications 30(6):10-13. doi:10.1109/MCG.2010.113

Abstract

Jim Thomas, a visionary scientist and inspirational leader, died on 6 August 2010 in Richland, Washington. His impact on the fields of computer graphics, user interface software, and visualization was extraordinary; his ability to personally change people's lives even more so. He is remembered for his enthusiasm, his mentorship, his generosity, and, most of all, his laughter. Jim's technical accomplishments are well summarized elsewhere. This collection of remembrances portrays him through the eyes of his many friends and colleagues, who were asked to write a personal note about what Jim meant to them.

Show all abstracts

2009

A Novel Visualization Technique for Electric Power Grid Analytics

Wong PC, K Schneider, P Mackey, H Foote, G Chin, R Guttromson, and J Thomas. 2009. "A Novel Visualization Technique for Electric Power Grid Analytics." IEEE Transactions on Visualization and Computer Graphics 15(3):410-423, May-June 2009.

Abstract

The application of information visualization holds tremendous promise for the electric power industry, but its potential has so far not been sufficiently exploited by the visualization community. Prior work on visualizing electric power systems has been limited to depicting raw or processed information on top of a geographic layout. Little effort has been devoted to visualizing the physics of the power grids, which ultimately determines the condition and stability of the electricity infrastructure. Based on this assessment, we developed a novel visualization system prototype, GreenGrid, to explore the planning and monitoring of the North American Electricity Infrastructure. The paper discusses the rationale underlying the GreenGrid design, describes its implementation and performance details, and assesses its strengths and weaknesses against the current geographic-based power grid visualization. We also present a case study using GreenGrid to analyze the information collected moments before the last major electric blackout in the Western United States and Canada, and a usability study to evaluate the practical significance of our design in simulated real-life situations. Our results indicate that many of the disturbance characteristics can be readily identified with the proper form of visualization.

Designing a Collaborative Visual Analytics Tool for Social and Technological Change Prediction.

Wong PC, LYR Leung, N Lu, MJ Scott, PS Mackey, HP Foote, J Correia, Jr, ZT Taylor, J Xu, SD Unwin, and AP Sanfilippo. 2009. Designing a Collaborative Visual Analytics Tool for Social and Technological Change Prediction. IEEE Computer Graphics and Applications 29(5):58-68. doi:10.1109/MCG.2009.92.

Abstract

We describe our ongoing efforts to design and develop a collaborative visual analytics tool to interactively model social and technological change of our society in a future setting. The work involves an interdisciplinary team of scientists from atmospheric physics, electrical engineering, building engineering, social sciences, economics, public policy, and national security. The goal of the collaborative tool is to predict the impact of global climate change on the U.S. power grids and its implications for society and national security. These future scenarios provide critical assessment and information necessary for policymakers and stakeholders to help formulate a coherent, unified strategy toward shaping a safe and secure society. The paper introduces the problem background and related work, explains the motivation and rationale behind our design approach, presents our collaborative visual analytics tool and usage examples, and finally shares the development challenges and lessons learned from our investigation.

The Scalable Reasoning System: Lightweight Visualization for Distributed Analytics

Pike W, J Bruce, B Baddeley, D Best, L Franklin, R May, D Rice, R Riensche, and K Younkin. 2009. "The Scalable Reasoning System: Lightweight Visualization for Distributed Analytics." Information Visualization 8(1):71-84, Spring 2009.

Abstract

A central challenge in visual analytics is the creation of accessible, widely distributable analysis applications that bring the benefits of visual discovery to as broad a user base as possible. Moreover, to support the role of visualization in the knowledge creation process, it is advantageous to allow users to describe the reasoning strategies they employ while interacting with analytic environments. We introduce an application suite called the scalable reasoning system (SRS), which provides web-based and mobile interfaces for visual analysis. The service-oriented analytic framework that underlies SRS provides a platform for deploying pervasive visual analytic environments across an enterprise. SRS represents a 'lightweight' approach to visual analytics whereby thin client analytic applications can be rapidly deployed in a platform-agnostic fashion. Client applications support multiple coordinated views while giving analysts the ability to record evidence, assumptions, hypotheses and other reasoning artifacts. We describe the capabilities of SRS in the context of a real-world deployment at a regional law enforcement organization.

Application and Evaluation of Analytic Gaming

Riensche RM, LM Martucci, J Scholtz, and MA Whiting. 2009. Application and Evaluation of Analytic Gaming. In 2009 International Conference on Computational Science and Engineering, August 29-31, 2009, Vancouver, Canada, vol. 4, pp. 1169-1173. IEEE Computer Society, Los Alamitos, CA. doi:10.1109/CSE.2009.250

Abstract

We describe an "analytic gaming" framework and methodology, and introduce formal methods for evaluation of the analytic gaming process. This process involves conception, development, and playing of games that are informed by predictive models and driven by players. Evaluation of analytic gaming examines both the process of game development and the results of game play exercises.

The Science of Interaction

Pike WA, JT Stasko, R Chang, and T O'Connell. 2009. The Science of Interaction. Information Visualization 8(4):263-274. doi:10.1057/ivs.2009.22

Abstract

There is a growing recognition within the visual analytics community that interaction and inquiry are inextricable. It is through the interactive manipulation of a visual interface - the analytic discourse - that knowledge is constructed, tested, refined, and shared. This paper reflects on the interaction challenges raised in the original visual analytics research and development agenda and further explores the relationship between interaction and cognition. It identifies recent exemplars of visual analytics research that have made substantive progress toward the goals of a true science of interaction, which must include theories and testable premises about the most appropriate mechanisms for human-information interaction. Six areas for further work are highlighted as those among the highest priorities for the next five years of visual analytics research: ubiquitous, embodied interaction; capturing user intentionality; knowledge-based interfaces; principles of design and perception; collaboration; and interoperability. Ultimately, the goal of a science of interaction is to support the visual analytics community through the recognition and implementation of best practices in the representation of and interaction with visual displays.

The Science of Analytic Reporting

Chinchor N, and WA Pike. 2009. The Science of Analytic Reporting. Information Visualization 8(4):286-293.

Abstract

The challenge of visually communicating analysis results is central to the ability of visual analytics tools to support decision making and knowledge construction. The benefit of emerging visual methods will be improved through more effective exchange of the insights generated through the use of visual analytics. This paper outlines the major requirements for next-generation reporting systems in terms of eight major research needs: the development of best practices, design automation, visual rhetoric, context and audience, connecting analysis to presentation, evidence and argument, collaborative environments, and interactive and dynamic documents. It also describes an emerging technology called Active Products that introduces new techniques for analytic process capture and dissemination.

Visual Analytics Technology Transition Progress

Scholtz J, KA Cook, MA Whiting, DK Lemon, and H Greenblatt. 2009. Visual Analytics Technology Transition Progress. Information Visualization 8(4):294-301.

Abstract

The authors describe the transition process for visual analytic tools and contrast it with the transition process for more traditional software tools. The paper describes a user-oriented approach to technology transition, including a discussion of key factors that should be considered and adapted to each situation. It reviews the progress made in transitioning visual analytic tools in the past five years and enumerates the challenges that remain.

Challenges for Visual Analytics

Thomas JJ, and J Kielman. 2009. Challenges for Visual Analytics. Information Visualization 8(4):309-314.

Abstract

Visual analytics has seen unprecedented growth in its first five years of mainstream existence. Great progress has been made in a short time, yet great challenges must be met in the next decade to provide new technologies that will be widely accepted by societies throughout the world. This paper sets the stage for some of those challenges in an effort to provide the stimulus for the research, both basic and applied, to address and exceed the envisioned potential for visual analytics technologies. We start with a brief summary of the initial challenges, followed by a discussion of the initial driving domains and applications, as well as additional applications and domains that have been a part of recent rapid expansion of visual analytics usage. We look at the common characteristics of several tools illustrating emerging visual analytics technologies, and conclude with the top ten challenges for the field of study. We encourage feedback and collaborative participation by members of the research community, the wide array of user communities, and private industry.

Foundations and Frontiers in Visual Analytics

Kielman J, JJ Thomas, and RA May, II. 2009. Foundations and Frontiers in Visual Analytics. Information Visualization 8(4):239-246.

Abstract

This introduction and future vision section for this special issue of the Journal of Information Visualization hopes to set the stage for an emerging worldwide effort to advance the state of the science of visual analytics. We present some of the driving needs, followed by foundational principles and methods for advancing this science through partnerships among national laboratories, academia, industry, and the international science community. We present a selection of the many success stories in which the science, engineering, and industrial communities have taken core science research to end users in the field during these early years. Next, we present some thoughts on the future vision. Finally, we introduce the eight papers in this special issue, each one addressing part of that vision.

Visual Analytics: Building a Vibrant and Resilient National Science

Wong PC, and JJ Thomas. 2009. Visual Analytics: Building a Vibrant and Resilient National Science. Information Visualization 8(4):302-308.

Abstract

Five years after the science of visual analytics was formally established, we attempt to use two different studies to assess the current state of the community and evaluate the progress the community has made in the past few years. The first study involves a comparison analysis of intellectual and scholastic accomplishments recently made by the visual analytics community. The second one aims to measure the degree of community reach and internet penetration of visual-analytics-related resources. This paper describes our efforts to harvest the study data, conduct analysis, and make interpretations based on parallel comparisons with five other established computer science areas.

A Multi-Level Middle-Out Cross-Zooming Approach for Large Graph Analytics

Wong PC, PS Mackey, KA Cook, RM Rohrer, HP Foote, and MA Whiting. 2009. A Multi-Level Middle-Out Cross-Zooming Approach for Large Graph Analytics. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), ed. J Stasko and JJ van Wijk, pp. 147-154. IEEE, Piscataway, NJ. doi:10.1109/VAST.2009.5333880

Abstract

This paper presents a working graph analytics model that embraces the strengths of the traditional top-down and bottom-up approaches with a resilient crossover concept to exploit the vast middle-ground information overlooked by the two extreme analytical approaches. Our graph analytics model is developed in collaboration with researchers and users, who carefully studied the functional requirements that reflect the critical thinking and interaction pattern of a real-life intelligence analyst. To evaluate the model, we implement a system prototype, known as GreenHornet, which allows our analysts to test the theory in practice, identify the technological and usage-related gaps in the model, and then adapt the new technology in their work space. The paper describes the implementation of GreenHornet and compares its strengths and weaknesses against the other prevailing models and tools.

Describing Story Evolution from Dynamic Information Streams

Rose SJ, RS Butner, WE Cowley, ML Gregory, and J Walker. 2009. Describing Story Evolution from Dynamic Information Streams. In IEEE Symposium on Visual Analytics Science and Technology (IEEE VAST) VAST 2009, Oct. 12-13, 2009, Atlantic City, NJ, pp. 99-106. IEEE, Piscataway, NJ.

Abstract

Sources of streaming information, such as news syndicates, publish information continuously. Information portals and news aggregators list the latest information from around the world, enabling information consumers to easily identify events in the past 24 hours. The volume and velocity of these streams cause information from prior days to quickly vanish despite its utility in providing an informative context for interpreting new information. Few capabilities exist to support an individual attempting to identify or understand trends and changes in streaming information over time. The burden of retaining prior information and integrating it with the new is left to the skills, determination, and discipline of each individual. In this paper we present a visual analytics system for linking essential content from information streams over time into dynamic stories that develop and change over multiple days. We describe particular challenges to the analysis of streaming information and explore visual representations for showing story change and evolution over time.

VAST Contest Dataset Use in Education

Whiting MA, C North, A Endert, J Scholtz, JN Haack, CF Varley, and JJ Thomas. 2009. VAST Contest Dataset Use in Education. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), ed. J Stasko and JJ van Wijk, pp. 115 - 122. IEEE, Piscataway, NJ. doi:10.1109/VAST.2009.5333245

Abstract

The IEEE Visual Analytics Science and Technology (VAST) Symposium has held a contest each year since its inception in 2006. These events are designed to provide visual analytics researchers and developers with analytic challenges similar to those encountered by professional information analysts. The VAST contest has had an extended life outside of the symposium, however, as materials are being used in universities and other educational settings, either to help teachers of visual analytics-related classes or for student projects. We describe how we develop VAST contest datasets so that the resulting products can be used in different settings, and review some specific examples of the adoption of the VAST contest materials in the classroom. The examples are drawn from graduate and undergraduate courses at Virginia Tech and from the Visual Analytics "Summer Camp" run by the National Visualization and Analytics Center in 2008. We finish with a brief discussion of evaluation metrics for education.

Two-stage Framework for Visualization of Clustered High Dimensional Data

Choo J, SJ Bohn, and H Park. 2009. Two-stage Framework for Visualization of Clustered High Dimensional Data. In IEEE Symposium on Visual Analytics Science and Technology (IEEE VAST). Pacific Northwest National Laboratory, Richland, WA. [Unpublished]

Abstract

In this paper, we discuss 2D visualization methods for high-dimensional data that are clustered and whose associated label information is available. We propose a two-stage framework for visualization of such data based on dimension reduction methods. In the first stage, we obtain the reduced dimensional data by a supervised dimension reduction method, such as linear discriminant analysis, that preserves the original cluster structure in terms of its criterion. The resulting optimal reduced dimension depends on the optimization criterion and is often larger than 2. In the second stage, in order to further reduce the dimension to 2 for visualization purposes, we apply another dimension reduction method, such as principal component analysis, that minimizes the distortion in the lower-dimensional representation of the data obtained in the first stage. Using this framework, we propose several two-stage methods and present their theoretical characteristics as well as experimental comparisons on both artificial and real-world text datasets.
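A minimal scikit-learn sketch of the two-stage framework, with synthetic labeled data standing in for clustered text; the paper's exact optimization criteria and method variants differ.

    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.decomposition import PCA

    # Synthetic clustered, labeled, high-dimensional data.
    X, y = make_classification(n_samples=500, n_features=50, n_informative=10,
                               n_classes=5, n_clusters_per_class=1,
                               random_state=0)

    # Stage 1: supervised reduction that preserves cluster structure.
    # LDA yields at most n_classes - 1 dimensions (here 4, still above 2).
    X_stage1 = LinearDiscriminantAnalysis(n_components=4).fit_transform(X, y)

    # Stage 2: unsupervised reduction of the stage-1 output to 2-D,
    # minimizing distortion in the PCA least-squares sense.
    X_2d = PCA(n_components=2).fit_transform(X_stage1)
    print(X_2d.shape)  # (500, 2), ready for a scatterplot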

Analytics for Massive Heat Maps

Love D, S Bohn, D Payne, and G Nakamura. 2009. "Analytics for Massive Heat Maps." Presented at the SPIE Visualization and Data Analysis Conference, San Jose, CA, January 19, 2009.

Abstract

High throughput instrumentation for genomics is producing data orders of magnitude greater than even a decade ago. Biologists often visualize the data of these experiments through the use of heat maps. For large datasets, heat map visualizations do not scale. These visualizations are only capable of displaying a portion of the data, making it difficult for scientists to find and detect patterns that span more than a subsection of the data. We present a novel method that provides an interactive visual display for massive heat maps, on the order of O(10^8) cells. Our process shows how a massive heat map can be decomposed into multiple levels of abstraction to represent the underlying macrostructures. We aggregate these abstractions into a framework that allows near real-time navigation of the space. To further assist pattern discovery, we ground our system on the principle of focus+context. Our framework also addresses the issue of balancing memory, display resolution, and heat map size. We show that this technique provides biologists with a powerful new visual metaphor for analyzing massive datasets.
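One way to read the multi-level abstraction idea: precompute successively coarser aggregates of the full matrix so that navigation only touches the level matching the display resolution. The block-mean pyramid below is an illustrative assumption, not the paper's framework.

    import numpy as np

    def build_pyramid(heat, levels=4):
        # Return successively coarser 2x2 block-mean versions of a heat map.
        pyramid = [heat]
        for _ in range(levels - 1):
            h, w = pyramid[-1].shape
            trimmed = pyramid[-1][:h - h % 2, :w - w % 2]   # drop odd edges
            coarse = trimmed.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
            pyramid.append(coarse)
        return pyramid

    rng = np.random.default_rng(0)
    full = rng.random((4096, 4096))          # stand-in for a massive matrix
    for level in build_pyramid(full):
        print(level.shape)                   # (4096, 4096) down to (512, 512)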

User-Centered Evaluation of Technosocial Predictive Analysis

Scholtz J, and M Whiting. 2009. "User-Centered Evaluation of Technosocial Predictive Analysis." Association for the Advancement of Artificial Intelligence, 2009.

Abstract

In today's technology-filled world, it is absolutely essential to show the utility of new software, especially software that brings entirely new capabilities to potential users. In the case of technosocial predictive analytics, researchers are developing software capabilities to augment human reasoning and cognition. Getting acceptance and buy-in from analysts and decision makers will not be an easy task. In this position paper, we discuss an approach we are taking for user-centered evaluation that we believe will facilitate the adoption of technosocial predictive software by the intelligence community.

Predicting the Impact of Climate Change on U.S. Power Grids and Its Wider Implications on National Security

Wong PC, LR Leung, N Lu, M Paget, J Correia, Jr, W Jiang, P Mackey, TZ Taylor, Y Xie, J Xu, S Unwin, and A Sanfilippo. 2009. "Predicting the Impact of Climate Change on U.S. Power Grids and Its Wider Implications on National Security." Association for the Advancement of Artificial Intelligence, 2009.

Abstract

We discuss our technosocial analytics research and development on predicting and assessing the impact of climate change on U.S. power grids and the wider implications for national security. The ongoing efforts extend cutting-edge modeling theories derived from climate, energy, social sciences, and national security domains to form a unified system coupled with an interactive visual interface for technosocial analysis. The goal of the system is to create viable future scenarios that address both the technical and social factors involved in the model domains. These scenarios enable policymakers to formulate a coherent, unified strategy towards building a safe and secure society. The paper gives an executive summary of our preliminary efforts in the past year and provides a glimpse of our work planned for the second year of a multi-year project being conducted at the Pacific Northwest National Laboratory.

Managing Complex Network Operation with Predictive Analytics

Huang Z, PC Wong, P Mackey, Y Chen, J Ma, K Schneider, and FL Greitzer. 2009. "Managing Complex Network Operation with Predictive Analytics." Association for the Advancement of Artificial Intelligence, 2009.

Visual analytics for law enforcement: deploying a service-oriented analytic framework for web-based visualization

Dowson ST, J Bruce, DM Best, RM Riensche, L Franklin, and WA Pike. 2009. "Visual Analytics for Law Enforcement: Deploying a Service-Oriented Analytic Framework for Web-Based Visualization." Proc. SPIE 7346:734603, 2009.

Abstract

This paper presents key components of the Law Enforcement Information Framework (LEIF), an information system that provides communications, situational awareness, and visual analytics tools in a service-oriented architecture supporting web-based desktop and handheld device users. LEIF simplifies interfaces and visualizations of well-established visual analytic techniques to improve usability. Advanced analytics capability is maintained by enhancing the underlying processing to support the new interface. LEIF development is driven by real-world user feedback gathered through deployments at three operational law enforcement organizations in the U.S. The system incorporates a robust information ingest pipeline supporting a wide variety of information formats. LEIF also insulates interface and analytical components from information sources, making it easier to adapt the framework to many different data repositories.

Advancing user-centered evaluation of visual analytic environments through contests.

Costello L, G Grinstein, C Plaisant, and J Scholtz. 2009. "Advancing user-centered evaluation of visual analytic environments through contests." Information Visualization 8(3):230-238.

Abstract

In this paper, the authors describe the Visual Analytics Science and Technology (VAST) Symposium contests run in 2006 and 2007 and the VAST 2008 and 2009 challenges. These contests were designed to provide researchers with a better understanding of the tasks and data that face potential end users. Access to these end users is limited because of time constraints and the classified nature of the tasks and data. In that respect, the contests serve as an intermediary, with the metrics and feedback serving as measures of utility to the end users. The authors summarize the lessons learned and the future directions for VAST Challenges.

Visual-Analytics Evaluation

Plaisant C, G Grinstein, and J Scholtz. 2009. "Visual-Analytics Evaluation." IEEE Computer Graphics and Applications 29(3):16-17. doi:10.1109/MCG.2009.56

Abstract

Visual analytics (VA) is the science of analytical reasoning facilitated by interactive visual interfaces. Assessing VA technology's effectiveness is challenging because VA tools combine several disparate components, both low and high level, integrated in complex interactive systems used by analysts, emergency responders, and others. These components include analytical reasoning, visual representations, computer-human interaction techniques, data representations and transformations, collaboration tools, and especially tools for communicating the results of their use. VA tool users' activities can be exploratory and can take place over days, weeks, or months. Users might not follow a predefined or even linear work flow. They might work alone or in groups. To understand these complex behaviors, an evaluation can target the component level, the system level, or the work environment level, and requires realistic data and tasks. Traditional evaluation metrics such as task completion time, number of errors, or recall and precision are insufficient to quantify the utility of VA tools, and new research is needed to improve our VA evaluation methodology.

Questionnaires for eliciting evaluation data from users of interactive question answering

Kelly D, PB Kantor, E Morse, J Scholtz, and Y Sun. 2009. "Questionnaires for eliciting evaluation data from users of interactive question answering." Natural Language Engineering 15(1):119-141. doi:10.1017/S1351324908004932

Abstract

Evaluating interactive question answering (QA) systems with real users can be challenging because traditional evaluation measures based on the relevance of items returned are difficult to employ, since relevance judgments can be unstable in multi-user evaluations. The work reported in this paper evaluates the effectiveness of three questionnaires in distinguishing among a set of interactive QA systems: a Cognitive Workload Questionnaire (NASA TLX) and Task and System Questionnaires customized to a specific interactive QA application. These questionnaires were evaluated with four systems, seven analysts, and eight scenarios during a two-week workshop. Overall, results demonstrate that all three questionnaires are effective at distinguishing among systems, with the Task Questionnaire being the most sensitive. Results also provide initial support for the validity and reliability of the questionnaires.

Visual Analysis of Dynamic Data Streams

Chin G, Jr, M Singhal, GC Nakamura, V Gurumoorthi, and NA Freeman-Cadoret. 2009. "Visual Analysis of Dynamic Data Streams." Information Visualization 8(3):212-229. doi:10.1057/ivs.2009.18

Abstract

For scientific data visualizations, real-time data streams present many interesting challenges when compared to static data. Real-time data are dynamic, transient, high-volume, and temporal. Effective visualizations need to be able to accommodate dynamic data behavior as well as abstract and present the data in ways that make sense to and are usable by humans. The Visual Content Analysis of Real-Time Data Streams project at the Pacific Northwest National Laboratory is researching and prototyping dynamic visualization techniques and tools to help facilitate human understanding and comprehension of high-volume, real-time data. The general strategy of the project is to develop and evolve visual contexts that will organize and orient high-volume dynamic data in conceptual and perceptive views. The goal is to allow users to quickly grasp dynamic data in forms that are intuitive and natural without requiring intensive training in the use of specific visualization or analysis tools and methods. Thus far, the project has built five visualization prototypes that represent and convey dynamic data through human-recognizable contexts and paradigms such as hierarchies, relationships, time, and geography. We describe the design considerations and unique features of these dynamic visualization prototypes as well as our findings in the exploration and evaluation of their use.

Show all abstracts

2008

A Dynamic Multiscale Magnifying Tool for Exploring Large Sparse Graphs

Wong PC, HP Foote, PS Mackey, G Chin, Jr, HJ Sofia, and JJ Thomas. 2008. "A Dynamic Multiscale Magnifying Tool for Exploring Large Sparse Graphs." Information Visualization 7:105-117.

Abstract

We present an information visualization tool, known as GreenMax, to visually explore large small-world graphs with up to a million graph nodes on a desktop computer. A major motivation for scanning a small-world graph in such a dynamic fashion is the demanding goal of identifying not just the well-known features but also the unknown–known and unknown–unknown features of the graph. GreenMax uses a highly effective multilevel graph drawing approach to pre-process a large graph by generating a hierarchy of increasingly coarse layouts that later support the dynamic zooming of the graph. This paper describes the graph visualization challenges, elaborates our solution, and evaluates the contributions of GreenMax in the larger context of visual analytics on large small-world graphs. We report the results of two case studies using GreenMax and the results support our claim that we can use GreenMax to locate unexpected features or structures behind a graph.
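A common way to build such a hierarchy of increasingly coarse graphs, shown here as a sketch of the general multilevel technique rather than GreenMax's own coarsening rule, is to repeatedly contract a maximal matching:

    import networkx as nx

    def coarsen(G):
        # One coarsening step: contract each matched edge into a supernode.
        mapping = {}
        for u, v in nx.maximal_matching(G):
            mapping[u] = mapping[v] = (u, v)   # supernode named by its pair
        for n in G.nodes:
            mapping.setdefault(n, n)           # unmatched nodes survive as-is
        coarse = nx.Graph()
        coarse.add_nodes_from(set(mapping.values()))
        for u, v in G.edges:
            if mapping[u] != mapping[v]:       # drop edges inside a supernode
                coarse.add_edge(mapping[u], mapping[v])
        return coarse

    G = nx.random_geometric_graph(1000, 0.06, seed=1)
    hierarchy = [G]
    while (hierarchy[-1].number_of_nodes() > 50
           and hierarchy[-1].number_of_edges() > 0):
        hierarchy.append(coarsen(hierarchy[-1]))
    print([g.number_of_nodes() for g in hierarchy])

Each level can then be laid out in advance, with finer levels drawn as the user zooms into a region.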

BioGraphE: High-performance bionetwork analysis using the Biological Graph Environment

Chin G, Jr, D Chavarría-Miranda, GC Nakamura, and HJ Sofia. 2008. "BioGraphE: High-performance bionetwork analysis using the Biological Graph Environment." BMC Bioinformatics.

Abstract

We introduce a computational framework for graph analysis called the Biological Graph Environment (BioGraphE), which provides a general, scalable integration platform for connecting graph problems in biology to optimized computational solvers and high-performance systems. This framework enables biology researchers and computational scientists to identify and deploy network analysis applications and to easily connect them to efficient and powerful computational software and hardware that are specifically designed and tuned to solve complex graph problems. In our particular application of BioGraphE to support network analysis in genome biology, we investigate the use of a Boolean satisfiability solver known as Survey Propagation as a core computational solver executing on standard high-performance parallel systems, as well as multithreaded architectures.

Bringing A Vector/Image Conflation Tool To The Commercial Market

Martucci LM, and B Kovalerchuk. 2008. "Bringing A Vector/Image Conflation Tool To The Commercial Market." In American Society of Photogrammetry and Remote Sensing (ASPRS) 2008 Annual Conference. American Society of Photogrammetry and Remote Sensing (ASPRS), Washington, DC.

Abstract

This paper addresses the conflation problem of integrating/aligning/fusing vector and image data in geospatial products, with special focus on bringing a solution to the commercial market. Users of geospatial data in government, military, industry, research, and other sectors need accurate displays of information such as roads and other terrain features in areas of interest and operations. Our general approach to vector/raster conflation examines the problem in three activity areas: preprocessing, conflation processing, and postprocessing. We use two well-developed and complementary methodologies with the goal of integrating them into a unified framework for an optimized conflation solution. This research is conducted within an Army Small Business Innovation Research (SBIR) project with the critically important aspect of pursuing a technology transfer and commercialization strategy that would provide a likely pathway for transition into an operational capability. We describe fundamental principles and generalized roles of participants in the commercialization process. Further, we introduce the concept of putting technically sound products to beneficial use through the steps of (i) defining the specific use scenarios and the respective operational/business environment of that use, and (ii) performing product marketing in accordance with use scenarios and the stimulation of related environments. Several sample scenarios are presented, along with operating/business environments, to demonstrate the concept. The approach assesses the technological readiness of the user for a vector/raster product with a view towards a more penetrating market analysis that attempts to pinpoint technology transition opportunities in a complex and ever-expanding geospatial data arena.

Progress and Challenges in Evaluating Tools for Sensemaking

Scholtz JC. 2008. "Progress and Challenges in Evaluating Tools for Sensemaking." Presented at the ACM Computer Human Interaction (CHI) Conference Workshop on Sensemaking, Florence, Italy, April 6, 2008.

Abstract

In this paper we discuss current work and challenges for the development of metrics to evaluate software designed to help analysts with sensemaking activities. While much of the work we describe has been done in the context of intelligence analysis, we are also concerned with the general applicability of metrics and evaluation methodologies for other analytic domains.

Geometry-Based Edge Clustering for Graph Visualization

Cui WW, H Zhou, H Qu, PC Wong, and XM Li. 2008. "Geometry-Based Edge Clustering for Graph Visualization." IEEE Transactions on Visualization and Computer Graphics 14(6):1277 - 1284. doi:10.1109/TVCG.2008.135

Abstract

Graphs have been widely used to model relationships among data. For large graphs, excessive edge crossings make the display visually cluttered and thus difficult to explore. In this paper, we propose a novel geometry-based edge-clustering framework that can group edges into bundles to reduce the overall edge crossings. Our method uses a control mesh to guide the edge-clustering process; edge bundles can be formed by forcing all edges to pass through some control points on the mesh. The control mesh can be generated at different levels of detail either manually or automatically based on underlying graph patterns. Users can further interact with the edge-clustering results through several advanced visualization techniques such as color and opacity enhancement. Compared with other edge-clustering methods, our approach is intuitive, flexible, and efficient. The experiments on some large graphs demonstrate the effectiveness of our method.
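The control-mesh algorithm in the paper is considerably more sophisticated; the sketch below conveys only the underlying idea, assuming a fixed regular grid whose cell centers act as shared control points.

    import numpy as np

    def bundle(edges, cell=0.25):
        # Route each edge through the control point of the grid cell
        # containing its midpoint, so nearby edges share a waypoint.
        bundled = []
        for p, q in edges:
            p, q = np.asarray(p, float), np.asarray(q, float)
            mid = (p + q) / 2
            control = (np.floor(mid / cell) + 0.5) * cell   # cell center
            bundled.append((p, control, q))  # render as a curve via control
        return bundled

    edges = [((0.1, 0.1), (0.9, 0.8)), ((0.15, 0.05), (0.85, 0.9)),
             ((0.0, 0.9), (0.9, 0.1))]
    for p, c, q in bundle(edges):
        print(p, "->", c, "->", q)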

Evaluating Visual Analytics at the 2007 VAST Symposium Contest

Plaisant C, G Grinstein, J Scholtz, MA Whiting, T O'Connell, S Laskowski, L Chien, A Tat, W Wright, C Gorg, Z Lui, N Parekh, K Singhal, and JT Stasko. 2008. "Evaluating Visual Analytics at the 2007 VAST Symposium Contest." IEEE Computer Graphics and Applications 28(2):12-21. doi:10.1109/MCG.2008.27

Abstract

In this article, we report on the contest's data set and tasks, the judging criteria, the winning tools, and the overall lessons learned in the competition. We believe that by organizing these contests, we're creating useful resources for researchers and are beginning to understand how to better evaluate VA tools. Competitions encourage the community to work on difficult problems, improve their tools, and develop baselines for others to build or improve upon. We continue to evolve a collection of data sets, scenarios, and evaluation methodologies that reflect the richness of the many VA tasks and applications.

Show all abstracts

2007

Fast Point-Feature Label Placement for Dynamic Visualizations

Mote KD. 2007. "Fast Point-Feature Label Placement for Dynamic Visualizations." Information Visualization 6(4):249-260.

Putting Security in Context: Visual Correlation of Network Activity with Real-World Information

Pike WA, SJ Zabriskie, and C Scherrer. 2007. "Putting Security in Context: Visual Correlation of Network Activity with Real-World Information." In Workshop on Visualization for Computer Security 2007 (VizSEC 07). PNNL-SA-57153, Pacific Northwest National Laboratory, Richland, WA.

Scalable Visual Analytics of Massive Textual Datasets

Krishnan M, SJ Bohn, WE Cowley, VL Crow, and J Nieplocha. 2007. "Scalable Visual Analytics of Massive Textual Datasets." In IEEE International Parallel & Distributed Processing Symposium. Long Beach, CA, March 26-30, 2007.

Abstract

This paper describes the first scalable implementation of a text processing engine used in visual analytics tools. These tools aid information analysts in interacting with and understanding large textual information content through visual interfaces. By developing a parallel implementation of the text processing engine, we enable visual analytics tools to exploit cluster architectures and handle massive datasets. The paper describes key elements of our parallelization approach and demonstrates virtually linear scaling when processing multi-gigabyte datasets such as PubMed. This approach enables interactive analysis of large datasets beyond the capabilities of existing state-of-the-art visual analytics tools.
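The engine itself is not published in this form; as a hedged, minimal illustration of the data-parallel pattern it relies on (per-document work fanned out to workers, partial results merged), consider:

    from collections import Counter
    from multiprocessing import Pool

    def term_counts(doc):
        # Per-document tokenization and counting, run in a worker process.
        return Counter(doc.lower().split())

    if __name__ == "__main__":
        docs = ["visual analytics of massive text",
                "massive text processing scales out",
                "scalable visual interfaces for analysts"] * 1000
        with Pool() as pool:
            partials = pool.map(term_counts, docs, chunksize=100)
        total = sum(partials, Counter())     # merge the partial term counts
        print(total.most_common(3))

The same map-then-merge structure extends naturally from a multicore pool to a cluster.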

Visual Analysis of Weblog Content

Gregory ML, DA Payne, D McColgin, NO Cramer, and DV Love. 2007. "Visual Analysis of Weblog Content." In International Conference on Weblogs and Social Media '07, pp. 227-230. Boulder, CO, March 26-28, 2007.

Abstract

In recent years, one of the major advances of the World Wide Web has been social media, and one of its fastest growing aspects is the blogosphere. Blogs make content creation easy and are highly accessible through web pages and syndication. With their growing influence, a need has arisen to monitor the opinions and insight revealed within their content. This paper describes a technical approach for analyzing the content of blog data using IN-SPIRE, a visual analytic tool developed by Pacific Northwest National Laboratory. We describe both how an analyst can explore blog data with IN-SPIRE and how the tool could be modified in the future to handle the specific nuances of analyzing blog data.

Visual Analytics Science and Technology

Wong PC. 2007. "Visual Analytics Science and Technology." Information Visualization 6(1):1-2.

Editorial Introduction: Discovering the Unexpected

Cook KA, RA Earnshaw, and JT Stasko. 2007. "Editorial Introduction: Discovering the Unexpected." IEEE Computer Graphics and Applications 27(5):15-19.

Abstract

The marriage of computation, visual representation, and interactive thinking supports intensive analysis. The goal is not only to permit users to detect expected events, such as might be predicted by models, but also to help users discover the unexpected—the surprising anomalies, changes, patterns, and relationships that are then examined and assessed to develop new insight. The Guest Editors discuss the key issues and challenges associated with discovering the unexpected, as well as introduce the articles that make up this Special Issue.

Show all abstracts

2006

Diverse Information Integration and Visualization

Havre SL, A Shah, C Posse, and BM Webb-Robertson. 2006. "Diverse Information Integration and Visualization." In Visualization and Data Analysis 2006 (EI10). SPIE - The International Society for Optical Engineering, San Jose, CA.

Abstract

This paper presents and explores a technique for visually integrating and exploring diverse information. Society produces, collects, and processes ever larger and more diverse data, including semi- and un-structured text as well as transaction, communication, and scientific data. It is no longer sufficient to analyze one type of data or information in isolation. Users need to explore their data/information in the context of related information to discover often hidden, but meaningful, complex relationships. Our approach visualizes multiple like entities across multiple dimensions, where each dimension is a partitioning of the entities. The partitioning may be based on inherent or assigned attributes of the entities (or entity data), such as meta-data or prior knowledge captured in annotations. The partitioning may also be derived from entity data. For example, clustering, or unsupervised classification, can be applied to arrays of multidimensional entity data to partition the entities into groups of similar entities, or clusters. The same entities may be clustered on data from different experiment types or processing approaches. This reduction of diverse data/information on an entity to a series of partitions, or discrete (and unit-less) categories, allows the user to view the entities across a variety of data without concern for data types and units. Parallel coordinates visualize entity data across multiple dimensions of typically continuous attributes. We adapt parallel coordinates for dimensions with discrete attributes (partitions) to allow the comparison of entity partition patterns for identifying trends and outlier entities. We illustrate this approach through a prototype, Juxter (short for Juxtaposer).
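A minimal sketch of the adapted parallel-coordinates idea using pandas; the entity names and partition columns are invented, and Juxter's rendering and interaction are not reproduced.

    import pandas as pd
    import matplotlib.pyplot as plt
    from pandas.plotting import parallel_coordinates

    # Each row is an entity; each column is one partitioning of the
    # entities (e.g., cluster IDs from different experiment types),
    # reduced to unit-less discrete categories.
    df = pd.DataFrame({
        "expression_clusters": [0, 0, 1, 2, 2, 1],
        "sequence_clusters":   [1, 1, 1, 0, 2, 2],
        "annotation_groups":   [0, 1, 1, 0, 2, 2],
        "entity":              ["g1", "g2", "g3", "g4", "g5", "g6"],
    })
    parallel_coordinates(df, class_column="entity", colormap="tab10")
    plt.yticks([0, 1, 2])   # discrete partition indices on every axis
    plt.show()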

From Question Answering to Visual Exploration

McColgin DW, ML Gregory, EG Hetzler, and AE Turner. 2006. "From Question Answering to Visual Exploration." In Proceedings of the ACM SIGIR workshop on Evaluating Exploratory Search Systems, pp. 47-50. Seattle, August 10, 2006.

Abstract

Research in Question Answering has focused on the quality of information retrieval or extraction, using the metrics of precision and recall to judge success; these metrics drive toward finding the specific best answer(s) and best support a lookup type of search. They do not address the opportunity that users' natural language questions present for exploratory interactions. In this paper, we present an integrated Question Answering environment that combines a visual analytics tool for unstructured text and a state-of-the-art query expansion tool designed to complement the cognitive processes associated with an information analyst's workflow. Analysts are seldom looking for factoid answers to simple questions; their information needs are much more complex in that they may be interested in patterns of answers over time and conflicting information, and even related non-answer data may be critical to learning about a problem or reaching prudent conclusions. In our visual analytics tool, questions result in a comprehensive answer space that allows users to explore the variety within the answers and spot related information in the rest of the data. The exploratory nature of the dialog between the user and this system requires tailored evaluation methods that better address the evolving user goals and counter cognitive biases inherent to exploratory search tasks.

Generating Graphs for Visual Analytics through Interactive Sketching

Wong PC, HP Foote, PS Mackey, KA Perrine, and G Chin, Jr. 2006. "Generating Graphs for Visual Analytics through Interactive Sketching." IEEE Transactions on Visualization and Computer Graphics 12(6). doi:10.1109/TVCG.2006.91

Abstract

We introduce an interactive graph generator, GreenSketch, designed to facilitate the creation of descriptive graphs required for different visual analytics tasks. The human-centric design approach of GreenSketch enables users to master the creation process without specific training or prior knowledge of graph model theory. The customized user interface encourages users to gain insight into the connection between the compact matrix representation and the topology of a graph layout when they sketch their graphs. Both the human-enforced and the machine-generated randomness supported by GreenSketch provide the flexibility needed to address the uncertainty factor in many analytical tasks. This paper describes over two dozen examples that cover a wide variety of graph creations, from a single line of nodes to a real-life small-world network that describes a snapshot of telephone connections. While the discussion focuses mainly on the design of GreenSketch, we include a case study that applies the technology in a visual analytics environment and a usability study that evaluates the strengths and weaknesses of our design approach.

Graph Signatures for Visual Analytics

Wong PC, HP Foote, G Chin, Jr, PS Mackey, and KA Perrine. 2006. "Graph Signatures for Visual Analytics." IEEE Transactions on Visualization and Computer Graphics 12(6). doi:10.1109/TVCG.2006.92

Abstract

We present a visual analytics technique to explore graphs using the concept of a data signature. A data signature, in our context, is a multidimensional vector that captures the local topology information surrounding each graph node. Signature vectors extracted from a graph are projected onto a low-dimensional scatterplot through the use of scaling. The resultant scatterplot, which reflects the similarities of the vectors, allows analysts to examine the graph structures and their corresponding real-life interpretations through repeated use of brushing and linking between the two visualizations. The interpretation of the graph structures is based on the outcomes of multiple participatory analysis sessions with intelligence analysts conducted by the authors at the Pacific Northwest National Laboratory. The paper first uses three public domain datasets with either well-known or obvious features to explain the rationale of our design and illustrate its results. More advanced examples are then used in a customized usability study to evaluate the effectiveness and efficiency of our approach. The study results reveal not only the limitations and weaknesses of the traditional approach based solely on graph visualization but also the advantages and strengths of our signature-guided approach presented in the paper.
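A hedged sketch of the signature idea: compute a small local-topology vector per node (the paper's actual features differ) and project the vectors to a 2-D scatterplot with multidimensional scaling.

    import networkx as nx
    import numpy as np
    from sklearn.manifold import MDS

    G = nx.karate_club_graph()

    # A simple per-node local-topology signature: degree, clustering
    # coefficient, and mean neighbor degree.
    clustering = nx.clustering(G)
    signatures = np.array([[G.degree(n),
                            clustering[n],
                            np.mean([G.degree(m) for m in G[n]])]
                           for n in G.nodes])

    # Project signature vectors to 2-D via multidimensional scaling;
    # nodes with similar local structure land near one another.
    xy = MDS(n_components=2, random_state=0).fit_transform(signatures)
    print(xy.shape)   # (34, 2), one scatterplot point per node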

Have Green - A Visual Analytics Framework for Large Semantic Graphs

Wong PC, G Chin, Jr, HP Foote, PS Mackey, and JJ Thomas. 2006. "Have Green - A Visual Analytics Framework for Large Semantic Graphs." In IEEE Symposium on Visual Analytics Science and Technology, pp 67-74. Baltimore, Maryland, October 31-November 2, 2006.

Abstract

A semantic graph is a network of heterogeneous nodes and links annotated with a domain ontology. In intelligence analysis, investigators use semantic graphs to organize concepts and relationships as graph nodes and links in hopes of discovering key trends, patterns, and insights. However, as new information continues to arrive from a multitude of sources, the size and complexity of the semantic graphs will soon overwhelm an investigator's cognitive capacity to carry out significant analyses. We introduce a powerful visual analytics framework designed to enhance investigators' natural analytical capabilities to comprehend and analyze large semantic graphs. The paper describes the overall framework design, presents major development accomplishments to date, and discusses future directions of a new visual analytics system known as Have Green.

Walking the Path-A New Journey to Explore and Discover through Visual Analytics

Wong PC, SJ Rose, G Chin, Jr, D Frincke, RA May, II, C Posse, AP Sanfilippo, and JJ Thomas. 2006. "Walking the Path-A New Journey to Explore and Discover through Visual Analytics." Information Visualization 5(4):237-249. doi:10.1057/palgrave.ivs.9500133

Abstract

Visual representations are essential aids to human cognitive tasks and are valued to the extent that they provide stable and external reference points upon which dynamic activities and thought processes may be calibrated and upon which models and theories can be tested and confirmed. The active use and manipulation of visual representations makes many complex and intensive cognitive tasks feasible. As described in the recently published "Illuminating the Path", visual analytics is "the science of analytical reasoning facilitated by interactive visual interfaces." We describe research and development at PNNL focused on improving the value that interactive visual representations provide to persons engaged in complex cognitive tasks. This work carries forward research from multiple disciplines with the goal of improving the capability of visual representations, and we present examples whose aim is to improve the extraction of, and reasoning about, information, knowledge, and data.

Understanding the Dynamics of Collaborative Multi-Party Discourse

Cowell AJ, ML Gregory, JR Bruce, JN Haack, DV Love, SJ Rose, and AH Andrew. 2006. "Understanding the Dynamics of Collaborative Multi-Party Discourse." Information Visualization 5(4):250-259. doi:10.1057/palgrave.ivs.9500139

Abstract

In this paper, we discuss the efforts underway at the Pacific Northwest National Laboratory in understanding the dynamics of multi-party discourse across a number of communication modalities, such as email, instant messaging traffic and meeting data. Two prototype systems are discussed. The Conversation Analysis Tool (ChAT) is an experimental test-bed for the development of computational linguistic components and enables users to easily identify topics or persons of interest within multi-party conversations, including who talked to whom, when, the entities that were discussed, etc. The Retrospective Analysis of Communication Events (RACE) prototype, leveraging many of the ChAT components, is an application built specifically for knowledge workers and focuses on merging different types of communication data so that the underlying message can be discovered in an efficient, timely fashion.

A Visual Analytics Agenda

Thomas JJ, and KA Cook. 2006. "A Visual Analytics Agenda." IEEE Computer Graphics and Applications 26(1):10-13. doi:10.1109/MCG.2006.5

Abstract

Researchers have made significant progress in disciplines such as scientific and information visualization, statistically based exploratory and confirmatory analysis, data and knowledge representations, and perceptual and cognitive sciences. Although some research is being done in this area, the pace at which new technologies and technical talents are becoming available is far too slow to meet the urgent need. The National Visualization and Analytics Center's goal is to advance the state of the science to enable analysts to detect the expected and discover the unexpected from massive and dynamic information streams and databases consisting of data of multiple types and from multiple sources, even though the data are often conflicting and incomplete. Visual analytics is a multidisciplinary field that includes the following focus areas: (i) analytical reasoning techniques, (ii) visual representations and interaction techniques, (iii) data representations and transformations, and (iv) techniques to support production, presentation, and dissemination of analytical results. The R&D agenda for visual analytics addresses technical needs for each of these focus areas, as well as recommendations for speeding the movement of promising technologies into practice. This article provides only a concise summary of the R&D agenda. We encourage reading, discussion, and debate, as well as active innovation toward the agenda for visual analytics.

2005

A Typology for Visualizing Uncertainty

Thomson JR, EG Hetzler, A MacEachren, MN Gahegan, and M Pavel. 2005. "A Typology for Visualizing Uncertainty." In Visualization and Data Analysis 2005, Published in Proceedings of the SPIE, vol. 5669, pp. 146-157. SPIE, IS&T, San Jose, CA.

Abstract

Information analysts must rapidly assess information to determine its usefulness in supporting and informing decision makers. In addition to assessing the content, the analyst must also be confident about the quality and veracity of the information. Visualizations can concisely represent vast quantities of information, thus aiding the analyst in examining larger quantities of material; however, visualization programs are challenged to incorporate a notion of confidence or certainty because the factors that influence the certainty or uncertainty of information vary with the type of information and the type of decisions being made. For example, the assessment of potentially subjective human-reported data leads to a large set of uncertainty concerns in fields such as national security, law enforcement (witness reports), and even scientific analysis where data is collected from a variety of individual observers. What's needed is a formal model or framework for describing uncertainty as it relates to information analysis, to provide a consistent basis for constructing visualizations of uncertainty. This paper proposes an expanded typology for uncertainty, drawing from past frameworks targeted at scientific computing. The typology provides general categories for analytic uncertainty, a framework for creating task-specific refinements to those categories, and examples drawn from the national security field.

Bioinformatic Insights from Metagenomics through Visualization

Havre SL, BM Webb-Robertson, A Shah, C Posse, B Gopalan, and FJ Brockman. 2005. "Bioinformatic Insights from Metagenomics through Visualization." In Proceedings of the IEEE Computational Systems Bioinformatics Conference (CSB 2005). August 8-11, 2005, pp. 341-350. IEEE Computer Society, Los Alamitos, CA.

Abstract

Cutting-edge biological and bioinformatics research seeks a systems perspective through the analysis of multiple types of high-throughput and other experimental data for the same sample. Systems-level analysis requires the integration and fusion of such data, typically through advanced statistics and mathematics. Visualization is a complementary computational approach that supports integration and analysis of complex data or its derivatives. We present a bioinformatics visualization prototype, Juxter, which depicts categorical information derived from or assigned to these diverse data for the purpose of comparing patterns across categorizations. The visualization allows users to easily discern correlated and anomalous patterns in the data. These patterns, which might not be detected automatically by algorithms, may reveal valuable information leading to insight and discovery. We describe the visualization and interaction capabilities and demonstrate its utility in a new field, metagenomics, which combines molecular biology and genetics to identify and characterize genetic material from multi-species microbial samples.

Building a Human Information Discourse Interface to Uncover Scenario Content

Sanfilippo AP, BL Baddeley, AJ Cowell, ML Gregory, RE Hohimer, and SC Tratz. 2005. "Building a Human Information Discourse Interface to Uncover Scenario Content." In 2005 International Conference on Intelligence Analysis. MITRE, McLean, VA.

Dynamic Visualization of Graphs with Extended Labels

Wong PC, PS Mackey, KA Perrine, JR Eagan, HP Foote, and J Thomas. 2005. "Dynamic Visualization of Graphs with Extended Labels." In 2005 IEEE Symposium on Information Visualization, Los Alamitos, CA, October 2005, pp. 73-80. IEEE, Piscataway, NJ.

Abstract

The paper describes a novel technique to visualize graphs with extended node and link labels. The lengths of these labels range from a short phrase to a full sentence to an entire paragraph and beyond. Our solution is different from all the existing approaches that almost always rely on intensive computational effort to optimize the label placement problem. Instead, we share the visualization resources with the graph and present the label information in static, interactive, and dynamic modes without the requirement for tackling the intractability issues. This allows us to reallocate the computational resources for dynamic presentation of real-time information. The paper includes a user study to evaluate the effectiveness and efficiency of the visualization technique.

Extending the Reach of Augmented Cognition To Real-World Decision Making Tasks

Greitzer FL. 2005. "Extending the Reach of Augmented Cognition To Real-World Decision Making Tasks." In Augmented Cognition International Conference. HCI-International, Las Vegas.

Abstract

The focus of this paper is on the critical challenge of bridging the gap between psychophysiological sensor data and the inferred cognitive states of users. It is argued that a more robust behavioral data collection foundation will facilitate accurate inferences about the state of the user so that an appropriate mitigation strategy, if needed, can be applied. The argument for such a foundation is based on two premises: (1) To realize the envisioned impact of augmented cognition systems, the technology should be applied to a broad, and more cognitively complex, range of real-world problems. (2) To support identifying cognitive states for more complex, real-world tasks, more sophisticated instrumentation will be needed for behavioral data collection. It is argued that such instrumentation would enable inferences to be made about higher-level semantic aspects of performance. The paper describes how instrumentation software developed to support information analysis R&D may serve as an integration environment that can provide additional behavioral data, in context, to facilitate inferences of cognitive state that will enable the successful augmenting of cognitive performance.

InfoStar: An Adaptive Visual Analytics Platform for Mobile Devices

Sanfilippo AP, RA May, II, GR Danielson, RM Riensche, and BL Baddeley. 2005. "InfoStar: An Adaptive Visual Analytics Platform for Mobile Devices." In First International Workshop on Managing Context Information in Mobile and Pervasive Environments. CEUR-WS.org, Ayia Napa, Cyprus.

Abstract

We present the design and implementation of InfoStar, an adaptive visual analytics platform for mobile devices such as PDAs, laptops, Tablet PCs and mobile phones. InfoStar extends the reach of visual analytics technology beyond the traditional desktop paradigm to provide ubiquitous access to interactive visualizations of information spaces. These visualizations are critical in addressing the knowledge needs of human agents operating in the field, in areas as diverse as business, homeland security, law enforcement, protective services, emergency medical services and scientific discovery. We describe an initial real-world deployment of this technology, in which the InfoStar platform has been used to offer mobile access to scheduling and venue information to conference attendees at Supercomputing 2004.

Metrics and Measures for Intelligence Analysis Task Difficulty

Greitzer FL, and KM Allwein. 2005. "Metrics and Measures for Intelligence Analysis Task Difficulty." In First International Conference on Intelligence Analysis Methods and Tools. MITRE Corp, McLean, VA.

Abstract

Recent workshops and conferences supporting the intelligence community (IC) have highlighted the need to characterize the difficulty or complexity of intelligence analysis (IA) tasks in order to facilitate assessments of the impact or effectiveness of IA tools that are being considered for introduction into the IC. Some fundamental issues are: (a) how to employ rigorous methodologies in evaluating tools, given a host of problems such as controlling for task difficulty, effects of time or learning, small-sample size limitations; (b) how to measure the difficulty/complexity of IA tasks in order to establish valid experimental/quasi-experimental designs aimed to support evaluation of tools; and (c) development of more rigorous (summative), performance-based measures of human performance during the conduct of IA tasks, beyond the more traditional reliance on formative assessments (e.g., subjective ratings). Invited discussants will be asked to comment on one or more of these issues, with the aim of bringing the most salient issues and research needs into focus.

New Challenges Facing Integrative Biological Science in the Post-Genomic Era

Oehmen CS, T Straatsma, GA Anderson, G Orr, BM Webb-Robertson, RC Taylor, RW Mooney, DJ Baxter, DR Jones, and DA Dixon. 2005. "New Challenges Facing Integrative Biological Science in the Post-Genomic Era." Journal of Biological Systems.

Abstract

The future of biology will be increasingly driven by the fundamental paradigm shift from hypothesis-driven research to data-driven discovery research employing the massive amounts of available biological data. We identify key technological developments needed to enable this paradigm shift involving (1) the ability to store and manage extremely large datasets which are dispersed over a wide geographical area, (2) development of novel analysis and visualization tools which are capable of operating on enormous data resources without overwhelming researchers with unusable information, and (3) formalisms for integrating mathematical models of biosystems from the molecular level to the organism population level. This will require the development of tools which efficiently utilize high-performance compute power, large storage infrastructures and large aggregate memory architectures. The end result will be the ability of a researcher to integrate complex data from many different sources with simulations to analyze a given system at a wide range of temporal and spatial scales in a single conceptual model.

Turning the Bucket of Text into a Pipe

Hetzler EG, VL Crow, DA Payne, and AE Turner. 2005. "Turning the Bucket of Text into a Pipe." In Proceedings of the IEEE Symposium on Information Visualization. INFOVIS 2005. 23-25 Oct. 2005, pp. 89-94. IEEE, Los Alamitos, CA.

Abstract

Many visual analysis tools operate on a fixed set of data. However, professional information analysts follow issues over a period of time, and need to be able to easily add the new documents to an ongoing exploration. Some analysts handle documents in a moving window of time, with new documents constantly added and old ones aging out. This paper describes both the user interaction and the technical implementation approach for a visual analysis system designed to support constantly evolving text collections.

Scientist-Centered Graph-Based Models of Scientific Knowledge

Chin G, Jr, EG Stephan, DK Gracio, OA Kuchar, PD Whitney, and KL Schuchardt. 2005. "Scientist-Centered Graph-Based Models of Scientific Knowledge." In HCI International 2005, 11th International Conference on Human-Computer Interaction, July 22-27, 2005, Las Vegas, Nevada, 10 pp. Lawrence Erlbaum Associates, Mahwah, NJ.

Abstract

At the Pacific Northwest National Laboratory, we are researching and developing visual models and paradigms that will allow scientists to capture and represent conceptual models in a computational form that may be linked to and integrated with scientific data sets and applications. Captured conceptual models may be logical in conveying how individual concepts tie together to form a higher theory, analytical in conveying intermediate or final analysis results, or temporal in describing the experimental process in which concepts are physically and computationally explored. In this paper, we describe and contrast three different research and development systems that allow scientists to capture and interact with computational graph-based models of scientific knowledge. Through these examples, we explore and examine ways in which researchers may graphically encode and apply scientific theory and practice on computer systems.

Top Ten Needs for Intelligence Analysis Tool Development

Badalamente RV, and FL Greitzer. 2005. "Top Ten Needs for Intelligence Analysis Tool Development." In First International Conference on Intelligence Analysis Methods and Tools. MITRE Corp, McLean, VA.

Abstract

The purpose of this paper is to report on the results of R&D to generate ideas about future enhancements to software systems designed to aid the process of intelligence analysis (IA). Use of IA tools in actual settings has revealed significant problems: the user's thought process has not been adequately modeled and is therefore not reflected in the design of analysis tools; users find the tools difficult to learn and use; the tools are not tailored to specific intelligence domains; the tools do not offer an integrated approach (data preprocessing/ingest is a particular problem); the tools do not address the longitudinal nature (continuing over extended periods of time) of the general analysis problem. The aim of this work was to establish an enduring, well-integrated, robust technical foundation for the development and deployment of information-technology (IT)-based IA tools recognized by users and clients as uniquely well designed to meet their varied analysis needs. An overarching strategy or "roadmap" is needed to guide technology development, and a more accurate understanding is needed about how real intelligence analysts do their job. To address these needs, we conducted a facilitated workshop with nine working analysts. An intelligence analysis process model was developed and discussed with the analysts as a point of departure for the discussion. Participants worked in break-out groups to discuss concepts for tools and enhanced products to aid in the IA process. The top ten enhancements identified during the workshop were: seamless data access and ingest; diverse data ingest and fusion; shared electronic folders for collaborative analysis; hypothesis generation and tracking; template for analysis strategy; electronic skills inventory; dynamic data processing and visualization; intelligent tutor for intelligence product development; imagery data resources; intelligence analysis knowledge base. This paper and presentation will discuss the conduct of the workshop and the results obtained.

Toward the Development of Cognitive Task Difficulty Metrics to Support Intelligence Analysis Research

Greitzer FL. 2005. "Toward the Development of Cognitive Task Difficulty Metrics to Support Intelligence Analysis Research." In The Fourth IEEE Conference on Cognitive Informatics, Aug. 8-10, 2005. ICCI 2005, pp. 315-320. Institute of Electrical and Electronics Engineers, Piscataway, NJ.

Abstract

Intelligence analysis is a cognitively complex task that is the subject of considerable research aimed at developing methods and tools to aid the analysis process. To support such research, it is necessary to characterize the difficulty or complexity of intelligence analysis tasks in order to facilitate assessments of the impact or effectiveness of tools that are being considered for deployment. A number of informal accounts of "What makes intelligence analysis hard" are available, but there has been no attempt to establish a more rigorous characterization with well-defined difficulty factors or dimensions. This paper takes an initial step in this direction by describing a set of proposed difficulty metrics based on cognitive principles.

Visual Sample Plan (VSP) Software: Designs and Data Analyses for Sampling Contaminated Buildings

Pulsipher BA, JE Wilson, RO Gilbert, LL Nuffer, and NL Hassig. 2005. "Visual Sample Plan (VSP) Software: Designs and Data Analyses for Sampling Contaminated Buildings." In Proceedings of 24th Annual National Conference on Managing Environmental Quality Systems, vol. 24-2-2, pp. 24-34. US EPA, Washington, DC.

Abstract

A new module of the Visual Sample Plan (VSP) software has been developed to provide sampling designs and data analyses for potentially contaminated buildings. An important application is assessing levels of contamination in buildings after a terrorist attack. This new module, funded by DHS through the Combating Terrorism Technology Support Office, Technical Support Working Group, was developed to provide a tailored, user-friendly and visually oriented buildings module within the existing VSP software toolkit, the latest version of which can be downloaded from http://vsp.pnnl.gov. In case of, or when planning against, a chemical, biological, or radionuclide release within a building, the VSP module can be used to quickly and easily develop and visualize technically defensible sampling schemes for walls, floors, ceilings, and other surfaces to statistically determine if contamination is present, its magnitude and extent throughout the building, and whether decontamination has been effective. This paper demonstrates the features of this new VSP buildings module, which include the ability to import building floor plans or to easily draw, manipulate, and view rooms in several ways; the ability to insert doors, windows, and annotations into a room; and 3-D graphic room views with surfaces labeled and floor plans that show building zones that have separate air handling units. The paper will also discuss the statistical design and data analysis options available in the buildings module. Design objectives supported include comparing an average to a threshold when the data distribution is normal or unknown, and comparing measurements to a threshold to detect hotspots or to ensure most of the area is uncontaminated when the data distribution is normal or unknown.

Enabling Proteomics Discovery Through Visual Analysis

Havre SL, M Singhal, DA Payne, MS Lipton, and BJM Webb-Robertson. 2005. "Enabling Proteomics Discovery Through Visual Analysis." IEEE Engineering in Medicine and Biology Magazine 24(3):50-57.

Abstract

This article presents the motivation for developing visual analysis tools for proteomic data and demonstrates their application to proteomics research with PQuad (Peptide Permutation and Protein Prediction), a functioning visual analytics tool for the study of systems biology in operation at the Pacific Northwest National Laboratory (PNNL). PQuad supports the exploration of proteins identified by proteomic techniques in the context of supplemental biological information. In particular, PQuad supports differential proteomics by simplifying the comparison of peptide sets from different experimental conditions as well as different protein identification or confidence-scoring techniques. Finally, PQuad supports data validation and quality control by providing a variety of resolutions for huge amounts of data to reveal errors undetected by other methods.

2004

Analysis Experiences Using Information Visualization

Hetzler E, and A Turner. 2004. "Analysis Experiences Using Information Visualization." IEEE Computer Graphics and Applications 24(5):22-26.

Abstract

To deliver truly useful tools, researchers must learn how to map between the knowledge domains inherent in information collections and the knowledge domains in users' minds. The true measure of this work is not what the software shows, but what the user is able to understand by using it. This article summarizes lessons learned from an observational study of the application of the In-Spire visually-oriented text exploitation system in an operational analysis environment.

Supporting Mutual Understanding in a Visual Dialogue Between Analyst and Computer

Chappell AR, AJ Cowell, DA Thurman, and JR Thomson. 2004. "Supporting Mutual Understanding in a Visual Dialogue Between Analyst and Computer." In Proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting, September 20-24, 2004, New Orleans, Louisiana. Human Factors and Ergonomics Society, Santa Monica, CA.

Abstract

The Knowledge Associates for Novel Intelligence (KANI) project is developing a system of automated "associates" to actively support and participate in the information analysis task. The primary goal of KANI is to use automatically extracted information in a reasoning system that draws on the strengths of both a human analyst and automated reasoning. The interface between the two agents is a key element in achieving this goal. The KANI interface seeks to support a visual dialogue with mixed-initiative manipulation of information and reasoning components. To be successful, the interface must achieve mutual understanding between the analyst and KANI of the other's actions. Toward this mutual understanding, KANI allows the analyst to work at multiple levels of abstraction over the reasoning process, links the information presented across these levels to make use of interaction context, and provides querying facilities to allow exploration and explanation.

Visual Analytics

Wong PC, and J Thomas. 2004. "Visual Analytics." IEEE Computer Graphics and Applications 24(5):20-21.

Excerpt:

The information revolution is upon us, and it's guaranteed to change our lives and the way we conduct our daily business. The fact that we have to deal with not just the size but also the variety and complexity of this information makes it a real challenge to survive the revolution. Enter visual analytics, a contemporary and proven approach to combine the art of human intuition and the science of mathematical deduction to directly perceive patterns and derive knowledge and insight from them.

Visual analytics is the formation of abstract visual metaphors in combination with a human information discourse (interaction) that enables detection of the expected and discovery of the unexpected within massive, dynamically changing information spaces. These suites of technologies apply to almost all fields but are being driven by critical needs in biology and national security...

Visualizing Data Streams

Wong PC, HP Foote, DR Adams, WE Cowley, LR Leung, and JJ Thomas. 2004. "Visualizing Data Streams." Chapter 11 in Visual and Spatial Analysis: Advances in Data Mining, Reasoning, and Problem Solving, ed. Boris Kovalerchuk and James Schwing, pp. 265-291. Springer, Dordrecht, Netherlands.

Abstract

We introduce two dynamic visualization techniques using multi-dimensional scaling to analyze transient data streams such as newswires and remote sensing imagery. While the time-sensitive nature of these data streams requires immediate attention in many applications, the unpredictable and unbounded characteristics of this information can potentially overwhelm many scaling algorithms that require a full re-computation for every update. We present an adaptive visualization technique based on data stratification to ingest stream information adaptively when influx rate exceeds processing rate. We also describe an incremental visualization technique based on data fusion to project new information directly onto a visualization subspace spanned by the singular vectors of the previously processed neighboring data. The ultimate goal is to leverage the value of legacy and new information and minimize re-processing of the entire dataset in full resolution. We demonstrate these dynamic visualization results using a newswire corpus and a remote sensing imagery sequence.
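
The incremental step can be sketched in a few lines of numpy; this is an assumed stand-in for the authors' implementation, not their code. New stream items are positioned by projecting them onto the leading singular vectors of the data already processed, so no full re-computation is needed:

```python
# Sketch: place new arrivals in the subspace spanned by the leading
# singular vectors of the previously processed data.
import numpy as np

rng = np.random.default_rng(0)
legacy = rng.normal(size=(500, 50))          # stand-in for processed feature vectors
center = legacy.mean(axis=0)
_, _, Vt = np.linalg.svd(legacy - center, full_matrices=False)
basis = Vt[:2].T                             # leading right singular vectors

new_batch = rng.normal(size=(20, 50))        # newly arrived stream items
coords = (new_batch - center) @ basis        # 2D positions, no re-fit of the scaling
```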

2003

Dynamic Visualization of Transient Data Streams

Wong PC, HP Foote, DR Adams, WE Cowley, and JJ Thomas. 2003. "Dynamic Visualization of Transient Data Streams." In Proceedings of the IEEE Symposium on Information Visualization 2003 (InfoVis 2003), Seattle, WA.

Abstract

We introduce two dynamic visualization techniques using multi-dimensional scaling to analyze transient data streams such as newswires and remote sensing imagery. While the time-sensitive nature of these data streams requires immediate attention in many applications, the unpredictable and unbounded characteristics of this information can potentially overwhelm many scaling algorithms that require a full re-computation for every update. We present an adaptive visualization technique based on data stratification to ingest stream information adaptively when influx rate exceeds processing rate. We also describe an incremental visualization technique based on data fusion to project new information directly onto a visualization subspace spanned by the singular vectors of the previously processed neighboring data. The ultimate goal is to leverage the value of legacy and new information and minimize re-processing of the entire dataset in full resolution. We demonstrate these dynamic visualization results using a newswire corpus and a remote sensing imagery sequence.

Global Visualization and Alignments of Whole Bacterial Genomes

Wong PC, K Wong, HP Foote, and JJ Thomas. 2003. "Global Visualization and Alignments of Whole Bacterial Genomes." IEEE Transactions on Visualization and Computer Graphics 9(3):361-377.

Abstract

We present a novel visualization technique to align whole bacterial genomes with millions of nucleotides. Our basic design combines the descriptive power of pixel-based visualizations with the interpretative strength of digital image-processing filters. The innovative use of pixel enhancement techniques on pixel-based visualizations brings out the best of the recursive data patterns and further enhances the effectiveness of the visualization techniques. The result is a fast, versatile, and cost-effective analysis tool to reveal the functional identifications and the phenotypic changes of whole bacterial genomes. Our experiments show that our visualization-based genome alignment technique outperforms other computational-based tools in processing time. They also show that our pictorial results are far superior to the hardcopy printouts generated by computation-based programs in studying the overall genomic structures. Six different bacterial genomes obtained from public genome banks are used to demonstrate our designs and measure their performances.

2002

Multivariate Visualization with Data Fusion

Wong PC, HP Foote, DL Kao, LR Leung, and JJ Thomas. 2002. "Multivariate Visualization with Data Fusion." In Information Visualization, vol. 1, no. 3/4, ed. Chaomei Chen, pp. 182-193. MacMillan, Hampshire, United Kingdom.

Abstract

We discuss a fusion-based visualization method to analyze a 2D flow field together with its related scalars. The primary difference between a conventional visualization and a fusion-based visualization is that the former draws on a single image whereas the latter draws on multiple see-through layers, which are then overlaid on each other to form the final visualization. We propose uniquely designed colormaps to highlight flow features that would not be shown with conventional colormaps. We present fusion techniques that integrate multiple single-purpose flow visualization techniques into the same viewing space. Our highly flexible fusion approach allows scientists to explore multiple parameters concurrently by mixing and matching images without frequently reconstructing new visualizations from the data for every possible combination. Sample datasets collected from a climate modeling study are used to demonstrate our approach.

ThemeRiver: Visualizing Thematic Changes in Large Document Collections

Havre S, E Hetzler, P Whitney, and L Nowell. "ThemeRiver: Visualizing Thematic Changes in Large Document Collections". IEEE Transactions on Visualization and Computer Graphics, Vol.8, No. 1, January-March 2002.

Abstract

The ThemeRiver visualization depicts thematic variations over time within a large collection of documents. The thematic changes are shown in the context of a timeline and corresponding external events. The focus on temporal thematic change within a context framework allows a user to discern patterns that suggest relationships or trends. For example, the sudden change of thematic strength following an external event may indicate a causal relationship. Such patterns are not readily accessible in other visualizations of the data. We use a river metaphor to convey several key notions. The document collection's timeline, selected thematic content, and thematic strength are indicated by the river's directed flow, composition, and changing width, respectively. The directed flow from left to right is interpreted as movement through time, and the horizontal distance between two points on the river defines a time interval. At any point in time, the vertical distance, or width, of the river indicates the collective strength of the selected themes. Colored "currents" flowing within the river represent individual themes. A current's vertical width narrows or broadens to indicate decreases or increases in the strength of the individual theme.
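
The river metaphor is straightforward to prototype. The toy sketch below uses made-up theme counts and matplotlib's symmetric streamgraph baseline as a stand-in for ThemeRiver's layout; only the width-encodes-strength idea is taken from the paper:

```python
# Toy streamgraph: river width = collective theme strength over time.
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(48)                                   # weekly time steps
themes = {
    "theme A": 5 + 3 * np.sin(t / 6.0),
    "theme B": 4 + 2 * np.cos(t / 9.0),
    "theme C": 2 + (t > 30) * 6.0,                  # sudden external event
}
fig, ax = plt.subplots()
ax.stackplot(t, list(themes.values()), labels=list(themes.keys()), baseline="sym")
ax.set_xlabel("time")
ax.legend(loc="upper left")
plt.show()
```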

2001

Change Blindness in Information Visualization: A Case Study

Nowell LT, EG Hetzler, and TE Tanasse. 2001. "Change Blindness in Information Visualization: A Case Study." In Proceedings of the IEEE Information Visualization Symposium 2001 (InfoVis 2001), October 22-23, 2001, San Diego, CA.

Interactive Visualization of Multiple Query Results

Havre S, E Hetzler, K Perrine, E Jurrus, and N Miller. 2001. "Interactive Visualization of Multiple Query Results." In Proceedings of the IEEE Information Visualization Symposium 2001 (InfoVis 2001), October 22-23, 2001, San Diego, CA.

Abstract

This paper introduces a graphical method for visually presenting and exploring the results of multiple queries simultaneously. This method allows a user to visually compare multiple query result sets, explore various combinations among the query result sets, and identify the “best” matches for combinations of multiple independent queries. This approach might also help users explore methods for progressively improving queries by visually comparing the improvement in result sets.

Radical SAM, A Novel Protein Superfamily Linking Unresolved Steps in Familiar Biosynthetic Pathways with Radical Mechanisms: Functional Characterization Using New Analysis and Information Visualization Methods

Sofia HJ, G Chen, EG Hetzler, JF Reyes Spindola, and NE Miller. 2001. "Radical SAM, A Novel Protein Superfamily Linking Unresolved Steps in Familiar Biosynthetic Pathways with Radical Mechanisms: Functional Characterization Using New Analysis and Information Visualization Methods." Nucleic Acids Research 29(5):1097-1106.

Abstract

A large protein superfamily with over 500 members has been discovered and analyzed using powerful new bioinformatics and information visualization methods. Evidence exists that these proteins generate a 5′-deoxyadenosyl radical by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe-S center. Radical SAM superfamily proteins function in DNA precursor, vitamin, cofactor, antibiotic, and herbicide biosynthesis in a collection of basic and familiar pathways. One of the members is interferon-inducible and is considered a candidate drug target for osteoporosis. The identification of this superfamily suggests that radical-based catalysis is important in a number of previously well-studied but unresolved biochemical pathways.

Discovering Knowledge Through Visual Analysis

Thomas JJ, PJ Cowley, OA Kuchar, LT Nowell, JR Thomson, and PC Wong. 2001. "Discovering Knowledge Through Visual Analysis." Journal of Universal Computer Science 7(6):517-529. doi:10.3217/jucs-007-06-0517

Abstract

This paper describes our vision for the near future in digital content analysis as it relates to the creation, verification, and presentation of knowledge. We focus on how visualization enables humans to make discoveries and gain knowledge. Visualization, in this context, is not just the picture representing the data but also a two-way interaction between humans and their information resources for the purposes of knowledge discovery, verification, and the sharing of knowledge with others. We present visual interaction and analysis examples to demonstrate how one current visualization tool analyzes large, diverse collections of text. This is followed by lessons learned and the presentation of a core concept for a new human information discourse.

2000

Data Signatures and Visualization of Very Large Datasets

Wong PC, H Foote, R Leung, D Adams, and J Thomas. 2000. Data Signatures and Visualization of Very Large Datasets. IEEE Computer Graphics and Applications, Vol 20, No 2, March 2000.

Abstract

Today, as data sets used in computations grow in size and complexity, the technologies developed over the years to deal with scientific data sets have become less efficient and effective. Many frequently used operations, such as eigenvector computation, could quickly exhaust our desktop workstations once the data size reaches certain limits.

On the other hand, the high-dimensional data sets we collect every day don't relieve the problem. Many conventional metric designs that build on quantitative or categorical data sets cannot be applied directly to heterogeneous data sets with multiple data types. While building new machines with more resources might conquer the data size problems, the complexity of today's computations requires a new breed of projection techniques to support analysis of the data and verification of the results.

We introduce the concept of a data signature, which captures the essence of a scientific data set in a compact format, and use it to conduct analysis as if using the original. A time-dependent climate simulation data set demonstrates our approach and presents the results.
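
A minimal sketch of the notion, with assumed summary features rather than the paper's actual signature construction: each slice of a large time-dependent data set is reduced to a short descriptor vector, and subsequent analysis operates on the descriptors instead of the raw fields:

```python
# Sketch: reduce each 2D simulation slice to a compact "signature" vector.
import numpy as np

def signature(field):
    """Compact descriptor of one 2D slice (features are illustrative)."""
    spectrum = np.abs(np.fft.rfft2(field))
    return np.array([field.mean(), field.std(),
                     field.min(), field.max(),
                     spectrum[:4, :4].sum() / spectrum.sum()])  # low-freq share

rng = np.random.default_rng(1)
slices = rng.normal(size=(100, 64, 64))                 # stand-in climate series
signatures = np.stack([signature(s) for s in slices])   # 100 x 5 instead of 100 x 4096
```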

DriftWeed - A Visual Metaphor for Interactive Analysis of Multivariate Data

Rose S and PC Wong. 2000. DriftWeed - A Visual Metaphor for Interactive Analysis of Multivariate Data. Proceedings IS&T/SPIE Conference on Visual Data Exploration and Analysis, San Jose, CA, Jan 2000.

Abstract

We present a visualization technique that allows a user to identify and detect patterns and structures within a multivariate data set. Our research builds on previous efforts to represent multivariate data in a two-dimensional information display through the use of icon plots. Although the icon plot work done by Pickett and Grinstein is similar to our approach, we improve on their efforts in several ways.

Our technique allows analysis of a time series without using animation; promotes visual differentiation of information clusters based on measures of variance; and facilitates exploration through direct manipulation of geometry based on scales of variance.

Our goal is to provide a visualization that implicitly conveys the degree to which an element's ordered collection (pattern) of attributes varies from the prevailing pattern of attributes for other elements in the collection. We apply this technique to multivariate abstract data and use it to locate exceptional elements in a data set and divisions among clusters.

ThemeRiver: Visualizing Theme Changes over Time

Havre S, B Hetzler, and L Nowell. 2000. "ThemeRiver: Visualizing Theme Changes over Time", Proceedings of IEEE Symposium on Information Visualization, InfoVis 2000, pp. 115 - 123.

Abstract

ThemeRiver™ is a prototype system that visualizes thematic variations over time within a large collection of documents. The "river" flows from left to right through time, changing width to depict changes in thematic strength of temporally associated documents. Colored "currents" flowing within the river narrow or widen to indicate decreases or increases in the strength of an individual topic or a group of topics in the associated documents. The river is shown within the context of a timeline and a corresponding textual presentation of external events.

Vector Fields Simplification - A Case Study of Visualizing Climate Modeling and Simulation Data Sets

Wong PC, H Foote, R Leung, E Jurrus, D Adams, and J Thomas. 2000. Vector Fields Simplification - A Case Study of Visualizing Climate Modeling and Simulation Data Sets. Proceedings IEEE Visualization 2000. Salt Lake City, Utah, Oct 8 - Oct 13, 2000.

Abstract

In our study of regional climate modeling and simulation, we frequently encounter vector fields that are crowded with large numbers of critical points. A critical point in a flow is where the vector field vanishes. While these critical points accurately reflect the topology of the vector fields, in our study only a subset of them is worth further investigation. We present a filtering technique based on the vorticity of the vector fields to eliminate the less interesting and sometimes sporadic critical points in a multi-resolution fashion. The neighboring regions of the preserved features, which are characterized by strong shear and circulation, are potential locations of weather instability. We apply our feature-filtering technique to a regional climate modeling data set covering East Asia in the summer of 1991.
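
Under the usual discrete approximation ω = ∂v/∂x − ∂u/∂y on a regular grid, the filtering step can be sketched as follows; the threshold and grid handling are assumptions for illustration, not the paper's parameters:

```python
# Sketch: drop critical points that sit in regions of weak vorticity.
import numpy as np

def vorticity(u, v, dx=1.0, dy=1.0):
    """Discrete 2D vorticity: dv/dx - du/dy via central differences."""
    dv_dx = np.gradient(v, dx, axis=1)
    du_dy = np.gradient(u, dy, axis=0)
    return dv_dx - du_dy

def filter_critical_points(points, u, v, threshold):
    """Keep only critical points (i, j) where |vorticity| is strong."""
    omega = vorticity(u, v)
    return [(i, j) for (i, j) in points if abs(omega[i, j]) >= threshold]

rng = np.random.default_rng(2)
u, v = rng.normal(size=(2, 32, 32))                    # stand-in flow components
kept = filter_critical_points([(10, 10), (20, 5)], u, v, threshold=0.5)
```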

Visualizing Sequential Patterns for Text Mining

Wong PC, W Cowley, H Foote, E Jurrus, and J Thomas. 2000. Visualizing Sequential Patterns for Text Mining. Proceedings IEEE Information Visualization 2000, Salt Lake City, Utah, Oct 8 - Oct 13, 2000.

Abstract

A sequential pattern in data mining is a finite series of elements such as A→B→C→D where A, B, C, and D are elements of the same domain. The mining of sequential patterns is designed to find patterns of discrete events that frequently happen in the same arrangement along a timeline. Like association and clustering, the mining of sequential patterns is among the most popular knowledge discovery techniques that apply statistical measures to extract useful information from large datasets. As our computers become more powerful, we are able to mine bigger datasets and obtain hundreds of thousands of sequential patterns in full detail. With this vast amount of data, we argue that neither data mining nor visualization by itself can manage the information and reflect the knowledge effectively. Subsequently, we apply visualization to augment data mining in a study of sequential patterns in large text corpora. The result shows that we can learn more and more quickly in an integrated visual data-mining environment.
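
For readers unfamiliar with the terminology, the toy snippet below illustrates what such a pattern is by counting the support of an ordered pattern across event timelines; it is support counting only, not the mining algorithm behind the study:

```python
# Sketch: does an ordered pattern occur as a (non-contiguous) subsequence?
def supports(sequence, pattern):
    """True if pattern's items occur in order within sequence."""
    it = iter(sequence)                       # 'in' consumes the iterator,
    return all(item in it for item in pattern)  # enforcing left-to-right order

timelines = [list("AXBYCZ"), list("BACAB"), list("AABBCC")]
pattern = list("ABC")
support = sum(supports(t, pattern) for t in timelines)  # here: 2 of 3
```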

1999

Visual Data Mining - Guest Editor's Introduction

Wong PC. 1999. Visual Data Mining - Guest Editor's Introduction. IEEE Computer Graphics and Applications, Vol 19, No 5, Sep 1999.

Abstract

Seeing is knowing, though merely seeing is not enough. When you understand what you see, seeing becomes believing. A while ago scientists discovered that seeing and understanding together enable humans to glean knowledge and deeper insight from large amounts of data. The approach integrates the human mind's exploration abilities with the enormous processing power of computers to form a powerful knowledge discovery environment that capitalizes on the best of both worlds. The technology builds on visual and analytical processes developed in various disciplines including scientific visualization, data mining, statistics, and machine learning with custom extensions that handle very large, multidimensional, multivariate data sets. The methodology is based on both functionality that characterizes structures and displays data and human capabilities that perceive patterns, exceptions, trends, and relationships. Here I'll define the vision, present the state of the art, and discuss the future of a young discipline called visual data mining.

Visualizing Association Rules for Text Mining

Wong PC, P Whitney, and J Thomas. 1999. Visualizing Association Rules for Text Mining. Proceedings IEEE Information Visualization 99, San Francisco, CA, Oct 24 - Oct 29, 1999.

Abstract

An association rule in data mining is an implication of the form X → Y where X is a set of antecedent items and Y is the consequent item. For years researchers have developed many tools to visualize association rules. However, few of these tools can handle more than dozens of rules, and none of them can effectively manage rules with multiple antecedents. Thus, it is extremely difficult to visualize and understand the association information of a large data set even when all the rules are available. This paper presents a novel visualization technique to tackle many of these problems. We apply the technology to a text mining study on large corpora. The results indicate that our design can easily handle hundreds of multiple antecedent association rules in a three-dimensional display with minimum human interaction, low occlusion percentage, and no screen swapping.
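
As background for the rule form, here is a small illustration of support and confidence for a multi-antecedent rule X → Y over toy transactions; this computation is standard textbook material, not the paper's contribution, which is the visualization of such rules:

```python
# Sketch: support and confidence of a multi-antecedent rule X -> y.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c", "d"}]

def rule_stats(X, y, transactions):
    n = len(transactions)
    n_X = sum(X <= t for t in transactions)          # antecedents present
    n_Xy = sum(X | {y} <= t for t in transactions)   # rule satisfied
    return n_Xy / n, n_Xy / n_X                      # support, confidence

support, confidence = rule_stats({"a", "b"}, "c", transactions)  # 0.5, 2/3
```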

ThemeRiver™: In Search of Trends, Patterns, and Relationships

Havre S, B Hetzler, and L Nowell. 1999. ThemeRiver™: In Search of Trends, Patterns, and Relationships. In Proceedings of IEEE Symposium on Information Visualization, InfoVis '99, October 25-26, San Francisco CA.

Abstract

ThemeRiver™ is a prototype system that visualizes thematic variations over time across a collection of documents. The "river" flows through time, changing width to depict changes in the thematic strength of documents temporally collocated. Themes or topics are represented as colored "currents" flowing within the river that narrow or widen to indicate decreases or increases in the strength of a topic in associated documents at a specific point in time. The river is shown within the context of a timeline and a corresponding textual presentation of external events.

Human Computer Interaction with Global Information Spaces - Beyond Data Mining

Thomas J, K Cook, V Crow, B Hetzler, R May, D McQuerry, R McVeety, N Miller, G Nakamura, L Nowell, P Whitney, and PC Wong. 1999. Human Computer Interaction with Global Information Spaces - Beyond Data Mining. Pacific Northwest National Laboratory, Richland, WA 99352.

Abstract

This invited paper describes a vision and progress towards a fundamentally new approach for dealing with the massive information overload of the emerging global information age. Today we use techniques such as data mining, through a WIMP interface, for searching or for analysis. Yet the human mind can deal with and interact with millions of information items, e.g. documents, simultaneously. The challenge is to find visual paradigms, interaction techniques, and physical devices that encourage a new human information discourse between humans and their massive global and corporate information resources. After describing the vision and current progress towards core technology development, we present the grand challenges of bringing this vision to reality.

1998

TOPIC ISLANDS™ - A Wavelet-Based Text Visualization System

Miller NE, PC Wong, M Brewster, and H Foote. 1998. TOPIC ISLANDS™ - A Wavelet-Based Text Visualization System. In Proceedings of the conference on Visualization '98, pp. 189-196.

Abstract

We present a novel approach to visualize and explore unstructured text. The underlying technology, called TOPIC-O-GRAPHY™, applies wavelet transforms to a custom digital signal constructed from words within a document. The resultant multiresolution wavelet energy is used to analyze the characteristics of the narrative flow in the frequency domain, such as theme changes, which is then related to the overall thematic content of the text document using statistical methods. The thematic characteristics of a document can be analyzed at varying degrees of detail, ranging from section-sized text partitions to partitions consisting of a few words. Using this technology, we are developing a visualization system prototype known as TOPIC ISLANDS™ to browse a document, generate fuzzy document outlines, summarize text by levels of detail and according to user interests, define meaningful subdocuments, query text content, and provide summaries of topic evolution.
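
A rough sketch of the signal idea, with an assumed word-to-number mapping and a plain Haar decomposition standing in for the TOPIC-O-GRAPHY™ transform: energy in the detail coefficients at each resolution level hints at where, and at what scale, themes shift:

```python
# Sketch: Haar wavelet energy per resolution level of a word-derived signal.
import numpy as np

def haar_level_energies(signal, levels=4):
    """Energy of Haar detail coefficients at each resolution level."""
    s = np.asarray(signal, dtype=float)       # length must stay even per level
    energies = []
    for _ in range(levels):
        approx = (s[0::2] + s[1::2]) / np.sqrt(2)   # averaging
        detail = (s[0::2] - s[1::2]) / np.sqrt(2)   # differencing
        energies.append(float(np.sum(detail ** 2)))
        s = approx
    return energies

vocab = {"trade": 0.0, "storm": 5.0}                 # crude word -> number map
words = ["trade"] * 30 + ["storm"] * 34              # abrupt theme change
print(haar_level_energies([vocab[w] for w in words]))
```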

Four Critical Elements for Designing Information Exploration Systems.

Hetzler B and N Miller. 1998. Four Critical Elements for Designing Information Exploration Systems. Presented at Information Exploration workshop for ACM SIGCHI '98. Los Angeles, CA. April 1998. PNNL-SA-29745

Abstract

Designing an information exploration system requires attention to four critical components. Since information exploration is a highly interactive process, the user is a key element. The second and third critical elements are the presentation methods that are used to communicate information and the interaction techniques that enable the user to actively explore that information. Finally, powerful mathematics are needed to identify and manipulate features of the information. This paper describes how these four critical components can work together to flexibly meet varied user goals.

Visualizing the Full Spectrum of Document Relationships

Hetzler B, WM Harris, S Havre, and P Whitney. 1998. Visualizing the Full Spectrum of Document Relationships. In Structures and Relations in Knowledge Organization, Proceedings of the 5th International ISKO Conference, pp. 168-175. ERGON Verlag, Wurzburg.

Abstract

Documents embody a rich and potentially very useful set of complex interrelationships, both among the documents themselves and among the terms they contain. However, the very richness of these relationships and the variety of potential applications make it difficult to present them in a usable form. This paper describes an approach that enables the user to visualize a multitude of document or entity relationships. Two visual metaphors are presented that allow the user to gain new insights and understandings by interactively exploring these relationship patterns at multiple levels of detail.

Multi-faceted Insight Through Interoperable Visual Information Analysis Paradigms.

Hetzler B, P Whitney, L Martucci, and J Thomas. 1998. Multi-faceted Insight Through Interoperable Visual Information Analysis Paradigms. In Proceedings of IEEE Symposium on Information Visualization, InfoVis '98, October 19-20, 1998, Research Triangle Park, North Carolina, pp. 137-144.

Abstract

To gain insight and understanding of complex information collections, users must be able to visualize and explore many facets of the information. This paper presents several novel visual methods from an information analyst's perspective. We present a sample scenario, using the various methods to gain a variety of insights from a large information collection. We conclude that no single paradigm or visual method is sufficient for many analytical tasks. Often a suite of integrated methods offers a better analytic environment in today's emerging culture of information overload and rapidly changing issues. We also conclude that the interactions among these visual paradigms are equally as important as, if not more important than, the paradigms themselves.

1997

Beyond Word Relations - SIGIR '97

Hetzler, E. 1997. Beyond Word Relations. SIGIR Forum, Fall 1997, Vol. 31, No. 2, pp. 28-32. ACM Press.

Abstract

Many information retrieval systems identify documents or provide a document visualization based on analysis of a particular relationship among documents — that of similar topical content. But there may be layers of other less apparent and less traditional relationships that are useful to the user. Exploring this other information was the subject of this workshop, with a focus on identifying new non-traditional relationships. An initial taxonomy was introduced and fleshed out during the workshop.

The Need For Metrics In Visual Information Analysis

Miller NE, G Nakamoto, B Hetzler, and P Whitney. 1997. The Need For Metrics In Visual Information Analysis. In Workshop on New Paradigms in Information Visualization and Manipulation, in conjunction with the Sixth ACM International Conference on Information and Knowledge Management (CIKM '97), November 13-14, 1997, Las Vegas, Nevada. ACM Press.

Abstract

This paper explores several methods for visualizing the thematic content of large document collections. As opposed to traditional query-driven document retrieval, these methods are used for exploring and gaining insight into document collections. For our experiments, we used 12,000 medical abstracts. The SPIRE [now IN-SPIRE] system was used to create the mathematical signal from text and to project the documents into a universe of "docustars" and as a thematic contour map based on thematic proximity. A self-organizing map is used to project the documents onto a "Tree" fractal. A topic-based approach is used to align documents between concepts in the "Cosmic Tumbleweed" projection. In the 32-D Hypercube, documents are organized by cascading theme strengths. An argument is made for a new type of metric that would facilitate comparisons among the many methods for visualizing or browsing document collections. An initial organization is proposed for some of the relevant research that metrics for information visualization can draw upon.

The STARLIGHT Information Visualization System

Risch JS, DB Rex, ST Dowson, TB Walters, RA May, and BD Moon. 1997. The STARLIGHT Information Visualization System. In Proceedings of the 1997 IEEE International Conference on Information Visualization (IV '97), August 27-29, 1997, London, England.

1995

Visualizing the non-visual: spatial analysis and interaction with information from text documents

Wise JA, JJ Thomas, K Pennock, D Lantrip, M Pottier, A Schur, and V Crow. 1995. "Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents." In Proceedings of Information Visualization 1995 (InfoVis '95), October 30-31, 1995, pp. 51-58.

Abstract

The paper describes an approach to information visualization (IV) that involves spatializing text content for enhanced visual browsing and analysis. The application arena is large text document corpora such as digital libraries, regulations and procedures, archived reports, etc. The basic idea is that text content from these sources may be transformed to a spatial representation that preserves informational characteristics from the documents. The spatial representation may then be visually browsed and analyzed in ways that avoid language processing and that reduce the analyst's mental workload. The result is an interaction with text that more nearly resembles perception and action with the natural world than with the abstractions of written language.
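
In modern terms the spatialization pipeline might be sketched as follows; this is an assumed stand-in (TF-IDF plus a truncated SVD projection), not the paper's actual method: vectorize the documents, then project them to two dimensions so thematically similar texts land near each other:

```python
# Sketch: documents -> term vectors -> 2D map coordinates.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = [
    "regional climate model simulation of monsoon rainfall",
    "wavelet analysis of document themes and topics",
    "bacterial genome alignment and visualization",
    "monsoon circulation and rainfall variability",
]
X = TfidfVectorizer(stop_words="english").fit_transform(docs)
xy = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)  # map coords
```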

Illuminating the Path: The Research and Development Agenda for Visual Analytics.