Information Visualization

Papers

Links to papers are to non-PNNL sites, some of which require a subscription or charge a fee to access the full text of papers.

2011

Graph Analytics—Lessons Learned and Challenges Ahead

Pak Chung Wong, Chaomei Chen, Carsten Gorg, Ben Shneiderman, John Stasko, Jim Thomas, Graph Analytics—Lessons Learned and Challenges Ahead, IEEE Computer Graphics and Applications, vol. 31, no. 5, pp. 18-29, Sep./Oct. 2011, doi:10.1109/MCG.2011.72

Abstract

Graph analytics is one of the most influential and important R&D topics in the visual analytics community. Researchers with diverse backgrounds from information visualization, human-computer interaction, computer graphics, graph drawing, and data mining have pursued graph analytics from scientific, technical, and social approaches. These studies have addressed both distinct and common challenges. Past successes and mistakes can provide valuable lessons for revising the research agenda. In this article, six researchers from four academic and research institutes identify graph analytics' fundamental challenges and present both insightful lessons learned from their experience and good practices in graph analytics research. The goal is to critically assess those lessons and shed light on how they can stimulate research and draw attention to grand challenges for graph analytics. The article also establishes principles that could lead to measurable standards and criteria for research.

HPC 2011 - A Highly Parallel Implementation of K-Means for Multithreaded Architecture

Mackey PS, JT Feo, PC Wong, and Y Chen. 2011. "A Highly Parallel Implementation of K-Means for Multithreaded Architecture." In 19th High Performance Computing Symposium (HPC 2011): SCS Spring Simulation Multiconference (SpringSim 2011), April 3-7, 2011, Boston, MA. ACM, New York, NY.

Abstract

We present a parallel implementation of the popular k-means clustering algorithm for massively multithreaded computer systems, as well as a parallelized version of the KKZ seed selection algorithm. We demonstrate that as system size increases, sequential seed selection can become a bottleneck. We also present an early attempt at parallelizing k-means that highlights critical performance issues when programming massively multithreaded systems. For our case studies, we used data collected from electric power simulations and run on the Cray XMT.
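To make the two ingredients named in this abstract concrete, here is a minimal sequential sketch of KKZ seed selection followed by plain k-means. It follows the KKZ rule as it is commonly described (the first seed is the point with the largest norm, and each later seed is the point farthest from its nearest already-chosen seed). It is not the paper's parallel Cray XMT implementation; the function names and the toy data are assumptions made for illustration.

```python
import math


def kkz_seeds(points, k):
    """Pick k seeds with the KKZ rule (sequential sketch).

    The first seed is the point with the largest Euclidean norm; each
    later seed is the point whose distance to its nearest chosen seed
    is largest.
    """
    seeds = [max(points, key=lambda p: math.hypot(*p))]
    # min_dist[i] = distance from points[i] to its nearest chosen seed
    min_dist = [math.dist(p, seeds[0]) for p in points]
    while len(seeds) < k:
        idx = max(range(len(points)), key=min_dist.__getitem__)
        seeds.append(points[idx])
        min_dist = [min(d, math.dist(p, points[idx]))
                    for d, p in zip(min_dist, points)]
    return seeds


def kmeans(points, seeds, iterations=20):
    """Plain Lloyd-style k-means starting from the given seeds."""
    centers = [tuple(s) for s in seeds]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster; keep the old
        # center if a cluster ends up empty.
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Toy data (hypothetical): two well-separated groups in 2D.
pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(kmeans(pts, kkz_seeds(pts, 2)))
```

In this sketch each new seed depends on all previously chosen seeds, so the outer loop is inherently sequential, which is consistent with the abstract's remark that seed selection can become a bottleneck as system size grows.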

Collaborative Visualization: Definition, Challenges, and Research Agenda

Isenberg P, N Elmqvist, J Scholtz, D Cernea, KL Ma, and H Hagen. 2011. Collaborative Visualization: Definition, Challenges, and Research Agenda. Information Visualization. doi:10.1177/1473871611412817

Abstract

Collaborative visualization has emerged as a new research direction which offers the opportunity to reach new audiences and application areas for visualization tools and techniques. Technology now allows us to easily connect and collaborate with one another—in settings as diverse as over networked computers, across mobile devices, or using shared displays such as interactive walls and tabletop surfaces. Any of these collaborative settings carries a set of challenges and opportunities for visualization research. Digital information is already regularly accessed by multiple people together in order to share information, to view it together, to analyze it, or to form decisions. However, research on how to best support collaboration with and around visualizations is still in its infancy and has so far focused only on a small subset of possible application scenarios. The purpose of this article is (1) to provide a clear scope, definition, and overview of the evolving field of collaborative visualization, (2) to help pinpoint the unique focus of collaborative visualization with its specific aspects, challenges, and requirements within the intersection of general computer-supported collaborative work (CSCW) and visualization research, and (3) to draw attention to important future research questions to be addressed by the community. Thus, the goal of the paper is to discuss a research agenda for future work on collaborative visualization, including our vision for how to meet the grand challenge and to urge for a new generation of visualization tools that were designed with collaboration in mind from their very inception.

Report on the Dagstuhl Seminar on Visualization and Monitoring of Network Traffic

Keim D, A Pras, J Schonwalder, PC Wong, and F Mansmann. 2011. Report on the Dagstuhl Seminar on Visualization and Monitoring of Network Traffic. Journal of Network and Systems Management 18(2):232-236. doi:10.1007/s10922-010-9161-1

Abstract

The Dagstuhl Seminar on Visualization and Monitoring of Network Traffic [1] took place May 17-20, 2009 in Dagstuhl, Germany. Dagstuhl seminars promote personal interaction and open discussion of results as well as new ideas. Unlike at most conferences, the focus is not solely on the presentation of established results but to equal parts on results, ideas, sketches, and open problems. The aim of this particular seminar was to bring together experts from the information visualization community and the networking community in order to discuss the state of the art of monitoring and visualization of network traffic. People from the different research communities involved jointly organized the seminar. The co-chairs of the seminar from the networking community were Aiko Pras (University of Twente) and Jürgen Schönwälder (Jacobs University Bremen). The co-chairs from the visualization community were Daniel A. Keim (University of Konstanz) and Pak Chung Wong (Pacific Northwest National Lab). Florian Mansmann (University of Konstanz) helped with producing this report. The seminar was organized and supported by Schloss Dagstuhl and the EC IST-EMANICS Network of Excellence.

2010

Automatic Keyword Extraction from Individual Documents

Rose SJ, DW Engel, NO Cramer, and WE Cowley. 2010. Automatic Keyword Extraction from Individual Documents. Chapter 1 in Text Mining: Application and Theory, vol. 1, ed. MW Berry and J Kogan, pp. 3-20. John Wiley & Sons, Chichester, United Kingdom.

Abstract

This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method's configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.
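As a rough illustration of extracting keywords "as sequences of one or more words" from a single document, the sketch below splits text at stop words and punctuation to form candidate phrases, then scores each phrase by summing a degree-to-frequency ratio over its words. The tiny stop list, the scoring formula as used here, and all function names are assumptions chosen for illustration; the chapter itself should be consulted for the authors' actual algorithm and parameters.

```python
import re
from collections import defaultdict

# Toy stop list (hypothetical); a real list would be far larger.
STOP_WORDS = {"a", "an", "and", "of", "the", "for", "in", "is", "on", "to", "are", "over"}


def candidate_phrases(text, stop_words=STOP_WORDS):
    """Split text into runs of non-stop words, bounded by stop words or punctuation."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()\[\]\n]", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in stop_words:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def score_keywords(text, stop_words=STOP_WORDS):
    """Score each candidate phrase as the sum of degree/frequency over its words."""
    phrases = candidate_phrases(text, stop_words)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    scores = {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(phrases)
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


print(score_keywords("Linear constraints and systems of linear equations"))
```

Because the stop list decides where phrases break, generating a stop list suited to a given corpus, as the abstract describes, directly changes which candidates are produced.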

Events and Trends in Text Streams

Engel DW, PD Whitney, and NO Cramer. 2010. Events and Trends in Text Streams. Chapter 9 in Text Mining: Application and Theory, vol. 1, ed. MW Berry and J Kogan, pp. 3-20. John Wiley & Sons, Chichester, United Kingdom.

Abstract

Text streams--collections of documents or messages that are generated and observed over time--are ubiquitous. Our research and development are targeted at developing algorithms to find and characterize changes in topic within text streams. To date, this research has demonstrated the ability to detect and describe 1) short duration, atypical events and 2) the emergence of longer-term shifts in topical content. This technology has been applied to predefined temporally ordered document collections but is also suitable for application to near-real-time textual data streams.

Real-Time Visualization of Network Behaviors for Situational Awareness

Best DM, SJ Bohn, DV Love, AS Wynne, and WA Pike. 2010. Real-Time Visualization of Network Behaviors for Situational Awareness. In Proceedings of the Seventh International Symposium on Visualization for Cyber Security, pp. 79-90. ACM, New York, NY. doi:10.1145/1850795.1850805

Abstract

Plentiful, complex, and dynamic data make understanding the state of an enterprise network difficult. Although visualization can help analysts understand baseline behaviors in network traffic and identify off-normal events, visual analysis systems often do not scale well to operational data volumes (in the hundreds of millions to billions of transactions per day) nor to analysis of emergent trends in real-time data. We present a system that combines multiple, complementary visualization techniques coupled with in-stream analytics, behavioral modeling of network actors, and a high-throughput processing platform called MeDICi. This system provides situational understanding of real-time network activity to help analysts take proactive response steps. We have developed these techniques using requirements gathered from the government users for which the tools are being developed. By linking multiple visualization tools to a streaming analytic pipeline, and designing each tool to support a particular kind of analysis (from high-level awareness to detailed investigation), analysts can understand the behavior of a network across multiple levels of abstraction.

Developing Qualitative Metrics for Visual Analytic Environments

Scholtz J. 2010. Developing Qualitative Metrics for Visual Analytic Environments. In BELIV '10: Beyond time and errors: novel evaluation methods for Information Visualization, A Workshop of the ACM CHI Conference, April 10-11, 2010, Atlanta, Georgia. Association for Computing Machinery, New York, NY.

Abstract

In this paper, we examine reviews for the entries to the 2009 Visual Analytics Science and Technology (VAST) Challenge. By analyzing these reviews we gained a better understanding of what is important to our reviewers, both visualization researchers and professional analysts. This is a bottom up approach to the development of heuristics to use in the evaluation of visual analytic environments. The meta-analysis and the results are presented in this paper.

Multimedia Analysis + Visual Analytics = Multimedia Analytics

Chinchor N, JJ Thomas, PC Wong, M Christel, and MW Ribarsky. 2010. Multimedia Analysis plus Visual Analytics = Multimedia Analytics. IEEE Computer Graphics and Applications 30(5):52-60. doi:10.1109/MCG.2010.92

Abstract

Multimedia analysis has focused on images, video, and to some extent audio and has made progress in single channels excluding text. Visual analytics has focused on the user interaction with data during the analytic process plus the fundamental mathematics and has continued to treat text as did its precursor, information visualization. The general problem we address in this tutorial is the combining of multimedia analysis and visual analytics to deal with multimedia information gathered from different sources, with different goals or objectives, and containing all media types and combinations in common usage.

High-Throughput Real-Time Network Flow Visualization

Best DM, DV Love, WA Pike, and SJ Bohn. 2010. High-Throughput Real-Time Network Flow Visualization. FloCon2010, New Orleans, LA. PNNL-SA-69233.

Abstract

This presentation and demonstration will introduce two interactive, high-throughput visual analysis tools, Traffic Circle and CLIQUE, and will discuss the analytic requirements of the U.S. government cyber security capabilities for which the tools were developed and are being deployed. Both tools take a time-based approach to visual analysis, with Traffic Circle displaying raw data and CLIQUE computing real-time behavioral models. Performance benchmarks will also be discussed; the tools are currently capable of ingesting and presenting data volumes on the order of hundreds of millions of flow records at once.

A Novel Application of Parallel Betweenness Centrality to Power Grid Contingency Analysis

Jin S, Z Huang, Y Chen, D Chavarria-Miranda, JT Feo, and PC Wong. 2010. A Novel Application of Parallel Betweenness Centrality to Power Grid Contingency Analysis. In IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2010), pp. 1-7. Institute of Electrical and Electronics Engineers, Piscataway, NJ. doi:10.1109/IPDPS.2010.5470400

Abstract

In Energy Management Systems, contingency analysis is commonly performed for identifying and mitigating potentially harmful power grid component failures. The exponentially increasing combinatorial number of failure modes imposes a significant computational burden for massive contingency analysis. It is critical to select a limited set of high-impact contingency cases within the constraint of computing power and time requirements to make it possible for real-time power system vulnerability assessment. In this paper, we present a novel application of parallel betweenness centrality to power grid contingency selection. We cross-validate the proposed method using the model and data of the western US power grid, and implement it on a Cray XMT system - a massively multithreaded architecture - leveraging its advantages for parallel execution of irregular algorithms, such as graph analysis. We achieve a speedup of 55 times (on 64 processors) compared against the single-processor version of the same code running on the Cray XMT. We also compare an OpenMP-based version of the same code running on an HP Superdome shared-memory machine. The performance of the Cray XMT code shows better scalability and resource utilization, and shorter execution time for large-scale power grids. This proposed approach has been evaluated in PNNL's Electricity Infrastructure Operations Center (EIOC). It is expected to provide a quick and efficient solution to massive contingency selection problems to help power grid operators to identify and mitigate potential widespread cascading power grid failures in real time.
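For readers unfamiliar with the measure, the sketch below computes betweenness centrality on a small unweighted, undirected graph using Brandes' algorithm, a standard sequential method. The abstract does not say which algorithm the authors parallelized, so Brandes' algorithm is a swapped-in choice here, and the four-node graph and its node names are hypothetical stand-ins for a power grid topology.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency maps each node to a list of its neighbors.
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths (sigma).
        stack = []
        predecessors = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        sigma[source] = 1
        dist = {v: -1 for v in adjacency}
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Accumulate dependencies in order of decreasing distance from source.
        delta = {v: 0.0 for v in adjacency}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {v: c / 2.0 for v, c in centrality.items()}


# Hypothetical four-bus chain: a - b - c - d
grid = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(betweenness_centrality(grid))
```

The outer loop over source nodes is independent per source, which is one reason this kind of computation is a natural candidate for the parallel execution the abstract describes. Ranking components by the resulting scores is one plausible way to shortlist contingency cases, though the paper's actual selection procedure may differ.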

A Multi-Phase Network Situational Awareness Cognitive Task Analysis

Erbacher R, DA Frincke, PC Wong, S Moody, and GA Fink. 2010. A Multi-Phase Network Situational Awareness Cognitive Task Analysis. Information Visualization 9(3):204-219.

Abstract

The goal of our project is to create a set of next-generation cyber situational-awareness capabilities with applications to other domains in the long term. The objective is to improve the decision-making process to enable decision makers to choose better actions. To this end, we put extensive effort into making certain that we had feedback from network analysts and managers and understand what their genuine needs are. This article discusses the cognitive task-analysis methodology that we followed to acquire feedback from the analysts. This article also provides the details we acquired from the analysts on their processes, goals, concerns, and the data and metadata that they analyze. Finally, we describe the generation of a novel task-flow diagram representing the activities of the target user base.

Cognitive Task Analysis of Network Analysts and Managers for Network Situational Awareness

Erbacher R, DA Frincke, PC Wong, S Moody, and GA Fink. 2010. Cognitive Task Analysis of Network Analysts and Managers for Network Situational Awareness. In Proceedings of the SPIE: Visualization and Data Analysis 2010, vol. 7530, ed. J Park, MC Hao, PC Wong and C Chen, p. Art No.: 75300H. SPIE, Bellingham, WA. doi:10.1117/12.845488

Abstract

The goal of the project was to create a set of next generation cyber situational awareness capabilities with applications to other domains in the long term. The goal is to improve the decision making process such that decision makers can choose better actions. To this end, we put extensive effort into ensuring we had feedback from network analysts and managers and understood what their needs truly were. Consequently, this is the focus of this portion of the research. This paper discusses the methodology we followed to acquire this feedback from the analysts, namely a cognitive task analysis. Additionally, this paper provides the details we acquired from the analysts. This essentially provides details on their processes, goals, concerns, the data and meta-data they analyze, etc. A final result we describe is the generation of a task-flow diagram.

2009

A Novel Visualization Technique for Electric Power Grid Analytics

Pak Chung Wong; Schneider, K.; Mackey, P.; Foote, H.; Chin, G.; Guttromson, R.; Thomas, J. 2009. A Novel Visualization Technique for Electric Power Grid Analytics. IEEE Transactions on Visualization and Computer Graphics, vol. 15, no. 3, pp. 410-423, May-June 2009.

Abstract

The application of information visualization holds tremendous promise for the electric power industry, but its potential has so far not been sufficiently exploited by the visualization community. Prior work on visualizing electric power systems has been limited to depicting raw or processed information on top of a geographic layout. Little effort has been devoted to visualizing the physics of the power grids, which ultimately determines the condition and stability of the electricity infrastructure. Based on this assessment, we developed a novel visualization system prototype, GreenGrid, to explore the planning and monitoring of the North American Electricity Infrastructure. The paper discusses the rationale underlying the GreenGrid design, describes its implementation and performance details, and assesses its strengths and weaknesses against the current geographic-based power grid visualization. We also present a case study using GreenGrid to analyze the information collected moments before the last major electric blackout in the Western United States and Canada, and a usability study to evaluate the practical significance of our design in simulated real-life situations. Our result indicates that many of the disturbance characteristics can be readily identified with the proper form of visualization.

Designing a Collaborative Visual Analytics Tool for Social and Technological Change Prediction

Wong PC, LYR Leung, N Lu, MJ Scott, PS Mackey, HP Foote, J Correia, Jr, ZT Taylor, J Xu, SD Unwin, and AP Sanfilippo. 2009. Designing a Collaborative Visual Analytics Tool for Social and Technological Change Prediction. IEEE Computer Graphics and Applications 29(5):58-68. doi:10.1109/MCG.2009.92.

Abstract

We describe our ongoing efforts to design and develop a collaborative visual analytics tool to interactively model social and technological change of our society in a future setting. The work involves an interdisciplinary team of scientists from atmospheric physics, electrical engineering, building engineering, social sciences, economics, public policy, and national security. The goal of the collaborative tool is to predict the impact of global climate change on the U.S. power grids and its implications for society and national security. These future scenarios provide critical assessment and information necessary for policymakers and stakeholders to help formulate a coherent, unified strategy toward shaping a safe and secure society. The paper introduces the problem background and related work, explains the motivation and rationale behind our design approach, presents our collaborative visual analytics tool and usage examples, and finally shares the development challenge and lessons learned from our investigation.

The Scalable Reasoning System: Lightweight Visualization for Distributed Analytics

Pike, William.; Bruce, Joe.; Baddeley, Bob.; Best, Daniel.; Franklin, Lyndsey.; May, Richard.; Rice, Douglas.; Riensche, Rick.; Younkin, Katarina. The Scalable Reasoning System: Lightweight Visualization for Distributed Analytics. Information Visualization. Vol. 8, no. 1, pp. 71-84. Spring 2009.

Abstract

A central challenge in visual analytics is the creation of accessible, widely distributable analysis applications that bring the benefits of visual discovery to as broad a user base as possible. Moreover, to support the role of visualization in the knowledge creation process, it is advantageous to allow users to describe the reasoning strategies they employ while interacting with analytic environments. We introduce an application suite called the scalable reasoning system (SRS), which provides web-based and mobile interfaces for visual analysis. The service-oriented analytic framework that underlies SRS provides a platform for deploying pervasive visual analytic environments across an enterprise. SRS represents a 'lightweight' approach to visual analytics whereby thin client analytic applications can be rapidly deployed in a platform-agnostic fashion. Client applications support multiple coordinated views while giving analysts the ability to record evidence, assumptions, hypotheses and other reasoning artifacts. We describe the capabilities of SRS in the context of a real-world deployment at a regional law enforcement organization.

Application and Evaluation of Analytic Gaming

Riensche RM, LM Martucci, J Scholtz, and MA Whiting. 2009. Application and Evaluation of Analytic Gaming. In 2009 International Conference on Computational Science and Engineering, August 29-31, 2009, Vancouver, Canada, vol. 4, pp. 1169-1173. IEEE Computer Society, Los Alamitos, CA. doi:10.1109/CSE.2009.250

Abstract

We describe an "analytic gaming" framework and methodology, and introduce formal methods for evaluation of the analytic gaming process. This process involves conception, development, and playing of games that are informed by predictive models and driven by players. Evaluation of analytic gaming examines both the process of game development and the results of game play exercises.

The Science of Interaction

Pike WA, JT Stasko, R Chang, and T O'Connell. 2009. The Science of Interaction. Information Visualization 8(4):263-274. doi:10.1057/ivs.2009.22

Abstract

There is a growing recognition within the visual analytics community that interaction and inquiry are inextricable. It is through the interactive manipulation of a visual interface - the analytic discourse - that knowledge is constructed, tested, refined, and shared. This paper reflects on the interaction challenges raised in the original visual analytics research and development agenda and further explores the relationship between interaction and cognition. It identifies recent exemplars of visual analytics research that have made substantive progress toward the goals of a true science of interaction, which must include theories and testable premises about the most appropriate mechanisms for human-information interaction. Six areas for further work are highlighted as those among the highest priorities for the next five years of visual analytics research: ubiquitous, embodied interaction; capturing user intentionality; knowledge-based interfaces; principles of design and perception; collaboration; and interoperability. Ultimately, the goal of a science of interaction is to support the visual analytics community through the recognition and implementation of best practices in the representation of and interaction with visual displays.

The Science of Analytic Reporting

Chinchor N, and WA Pike. 2009. The Science of Analytic Reporting. Information Visualization 8(4):286-293.

Abstract

The challenge of visually communicating analysis results is central to the ability of visual analytics tools to support decision making and knowledge construction. The benefit of emerging visual methods will be improved through more effective exchange of the insights generated through the use of visual analytics. This paper outlines the major requirements for next-generation reporting systems in terms of eight major research needs: the development of best practices, design automation, visual rhetoric, context and audience, connecting analysis to presentation, evidence and argument, collaborative environments, and interactive and dynamic documents. It also describes an emerging technology called Active Products that introduces new techniques for analytic process capture and dissemination.

Visual Analytics Technology Transition Progress

Scholtz J, KA Cook, MA Whiting, DK Lemon, and H Greenblatt. 2009. Visual Analytics Technology Transition Progress. Information Visualization 8(4) (sp. iss.):294-301.

Abstract

The authors provide a description of the transition process for visual analytic tools and contrast this with the transition process for more traditional software tools. This paper takes this into account and describes a user-oriented approach to technology transition including a discussion of key factors that should be considered and adapted to each situation. The progress made in transitioning visual analytic tools in the past five years is described and the challenges that remain are enumerated.

Challenges for Visual Analytics

Thomas JJ, and J Kielman. 2009. Challenges for Visual Analytics. Information Visualization 8(4) (Sp. Iss. SI):309-314.

Abstract

Visual analytics has seen unprecedented growth in its first five years of mainstream existence. Great progress has been made in a short time, yet great challenges must be met in the next decade to provide new technologies that will be widely accepted by societies throughout the world. This paper sets the stage for some of those challenges in an effort to provide the stimulus for the research, both basic and applied, to address and exceed the envisioned potential for visual analytics technologies. We start with a brief summary of the initial challenges, followed by a discussion of the initial driving domains and applications, as well as additional applications and domains that have been a part of recent rapid expansion of visual analytics usage. We look at the common characteristics of several tools illustrating emerging visual analytics technologies, and conclude with the top ten challenges for the field of study. We encourage feedback and collaborative participation by members of the research community, the wide array of user communities, and private industry.

Foundations and Frontiers in Visual Analytics

Kielman J, JJ Thomas, and RA May, II. 2009. Foundations and Frontiers in Visual Analytics. Information Visualization. 8(4):239-246.

Abstract

This introduction and future vision section for this special issue of the Journal of Information Visualization hopes to set the stage for an emerging worldwide effort to advance the state of the science of visual analytics. We present some of the driving needs followed by some foundational principles and methods for advancing this science through partnerships among national laboratories, academia, industry, and the international science community. We will present a selection of the many success stories the science, engineering, and industrial communities have of taking core science research to end users in the field during these early years. Next, we will present some thoughts on the future vision. Finally, we will introduce the 8 papers in this special issue, each one addressing part of that vision.

Visual Analytics: Building a Vibrant and Resilient National Science

Wong PC, and JJ Thomas. 2009. Visual Analytics: Building a Vibrant and Resilient National Science. Information Visualization 8(4) (Sp. Iss. SI):302-308.

Abstract

Five years after the science of visual analytics was formally established, we attempt to use two different studies to assess the current state of the community and evaluate the progress the community has made in the past few years. The first study involves a comparison analysis of intellectual and scholastic accomplishments recently made by the visual analytics community. The second one aims to measure the degree of community reach and internet penetration of visual-analytics-related resources. This paper describes our efforts to harvest the study data, conduct analysis, and make interpretations based on parallel comparisons with five other established computer science areas.

A Multi-Level Middle-Out Cross-Zooming Approach for Large Graph Analytics

Wong PC, PS Mackey, KA Cook, RM Rohrer, HP Foote, and MA Whiting. 2009. A Multi-Level Middle-Out Cross-Zooming Approach for Large Graph Analytics. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), ed. J Stasko and JJ van Wijk, pp. 147-154. IEEE, Piscataway, NJ. doi:10.1109/VAST.2009.5333880

Abstract

This paper presents a working graph analytics model that embraces the strengths of the traditional top-down and bottom-up approaches with a resilient crossover concept to exploit the vast middle-ground information overlooked by the two extreme analytical approaches. Our graph analytics model is developed in collaboration with researchers and users, who carefully studied the functional requirements that reflect the critical thinking and interaction pattern of a real-life intelligence analyst. To evaluate the model, we implement a system prototype, known as GreenHornet, which allows our analysts to test the theory in practice, identify the technological and usage-related gaps in the model, and then adapt the new technology in their work space. The paper describes the implementation of GreenHornet and compares its strengths and weaknesses against the other prevailing models and tools.

Describing Story Evolution from Dynamic Information Streams

Rose SJ, RS Butner, WE Cowley, ML Gregory, and J Walker. 2009. Describing Story Evolution from Dynamic Information Streams. In IEEE Symposium on Visual Analytics Science and Technology (IEEE VAST) VAST 2009, Oct. 12-13, 2009, Atlantic City, NJ, pp. 99-106. IEEE, Piscataway, NJ. doi:10.1109/VAST.2009.5333437

Abstract

Sources of streaming information, such as news syndicates, publish information continuously. Information portals and news aggregators list the latest information from around the world, enabling information consumers to easily identify events in the past 24 hours. The volume and velocity of these streams cause information from prior days to quickly vanish despite its utility in providing an informative context for interpreting new information. Few capabilities exist to support an individual attempting to identify or understand trends and changes from streaming information over time. The burden of retaining prior information and integrating it with the new is left to the skills, determination, and discipline of each individual. In this paper we present a visual analytics system for linking essential content from information streams over time into dynamic stories that develop and change over multiple days. We describe particular challenges to the analysis of streaming information and explore visual representations for showing story change and evolution over time.

VAST Contest Dataset Use in Education

Whiting MA, C North, A Endert, J Scholtz, JN Haack, CF Varley, and JJ Thomas. 2009. VAST Contest Dataset Use in Education. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), ed. J Stasko and JJ van Wijk, pp. 115-122. IEEE, Piscataway, NJ. doi:10.1109/VAST.2009.5333245

Abstract

The IEEE Visual Analytics Science and Technology (VAST) Symposium has held a contest each year since its inception in 2006. These events are designed to provide visual analytics researchers and developers with analytic challenges similar to those encountered by professional information analysts. The VAST contest has had an extended life outside of the symposium, however, as materials are being used in universities and other educational settings, either to help teachers of visual analytics-related classes or for student projects. We describe how we develop VAST contest datasets that result in products that can be used in different settings and review some specific examples of the adoption of the VAST contest materials in the classroom. The examples are drawn from graduate and undergraduate courses at Virginia Tech and from the Visual Analytics "Summer Camp" run by the National Visualization and Analytics Center in 2008. We finish with a brief discussion on evaluation metrics for education.

Two-stage Framework for Visualization of Clustered High Dimensional Data

Choo J, SJ Bohn, and H Park. 2009. Two-stage Framework for Visualization of Clustered High Dimensional Data. In IEEE Symposium on Visual Analytics Science and Technology (IEEE VAST). PNNL-SA-65520, Pacific Northwest National Laboratory, Richland, WA. [Unpublished]

Abstract

In this paper, we discuss 2D visualization methods for high-dimensional data that are clustered and for which associated label information is available. We propose a two-stage framework for visualization of such data based on dimension reduction methods. In the first stage, we obtain the reduced dimensional data by a supervised dimension reduction method such as linear discriminant analysis that preserves the original cluster structure in terms of its criterion. The resulting optimal reduced dimension depends on the optimization criteria and is often larger than 2. In the second stage, in order to further reduce the dimension to 2 for visualization purposes, we apply another dimension reduction method such as principal component analysis that minimizes the distortion in the lower dimensional representation of the data obtained in the first stage. Using this framework, we propose several two-stage methods, and present their theoretical characteristics as well as experimental comparisons on both artificial and real-world text data sets.

Analytics for Massive Heat Maps

Love, D.; Bohn, Shawn; Payne, Deborah; Nakamura, Grant. 2009. Analytics for Massive Heat Maps. SPIE Visualization and Data Analysis conference, San Jose, 19 January 2009.

Abstract

High throughput instrumentation for genomics is producing data orders of magnitude greater than even a decade before. Biologists often visualize the data of these experiments through the use of heat maps. For large datasets, heat map visualizations do not scale. These visualizations are only capable of displaying a portion of the data, making it difficult for scientists to find and detect patterns that span more than a subsection of the data. We present a novel method that provides an interactive visual display for massive heat maps [O(10^8)]. Our process shows how a massive heat map can be decomposed into multiple levels of abstraction to represent the underlying macrostructures. We aggregate these abstractions into a framework that can allow near real-time navigation of the space. To further assist pattern discovery, we ground our system on the principle of focus+context. Our framework also addresses the issue of balancing the memory and display resolution and heat map size. We will show that this technique for biologists provides a powerful new visual metaphor for analyzing massive datasets.
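The decomposition into multiple levels of abstraction can be illustrated with a small sketch. The abstract does not say how the levels are built, so the 2x2 mean pooling used here, and the names `coarsen` and `build_pyramid`, are assumptions chosen for illustration rather than the authors' method.

```python
def coarsen(grid):
    """Average each 2x2 block of `grid` into one cell.

    Blocks on the right or bottom edge average whatever cells exist.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            block = [grid[i][j]
                     for i in range(r, min(r + 2, rows))
                     for j in range(c, min(c + 2, cols))]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def build_pyramid(grid):
    """Return [full resolution, half, quarter, ...] down to a single cell."""
    levels = [grid]
    while len(levels[-1]) > 1 or len(levels[-1][0]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels


# Hypothetical 4x4 heat map used only to exercise the sketch.
heat_map = [
    [1.0, 3.0, 5.0, 7.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 8.0, 8.0],
    [2.0, 2.0, 8.0, 8.0],
]
pyramid = build_pyramid(heat_map)
for level in pyramid:
    print(len(level), "x", len(level[0]))
```

A viewer built on such a pyramid could draw a coarse level for the overview and fetch finer levels only for the region in focus, which is one way to trade memory against display resolution.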

User-Centered Evaluation of Technosocial Predictive Analysis

Scholtz J.; Whiting M. 2009. User-Centered Evaluation of Technosocial Predictive Analysis. Association for the Advancement of Artificial Intelligence 2009

Abstract

In today's technology filled world, it is absolutely essential to show the utility of new software, especially software that brings entirely new capabilities to potential users. In the case of technosocial predictive analytics, researchers are developing software capabilities to augment human reasoning and cognition. Getting acceptance and buy-in from analysts and decision makers will not be an easy task. In this position paper, we discuss an approach we are taking for user-centered evaluation that we believe will result in facilitating the adoption of technosocial predictive software by the intelligence community.

Predicting the Impact of Climate Change on U.S. Power Grids and Its Wider Implications on National Security

Wong, P.C.; Ruby Leung, L R.; Lu, N.; Paget, M.; Correia, J Jr.; Jiang, W.; Mackey, P.; Taylor, T. Z.; Xie, Y.; Xu, J.; Unwin, S.; Sanfilippo, A. 2009. Predicting the Impact of Climate Change on U.S. Power Grids and Its Wider Implications on National Security. Association for the Advancement of Artificial Intelligence

Abstract

We discuss our technosocial analytics research and development on predicting and assessing the impact of climate change on U.S. power-grids and the wider implications for national security. The ongoing efforts extend cutting-edge modeling theories derived from climate, energy, social sciences, and national security domains to form a unified system coupled with an interactive visual interface for technosocial analysis. The goal of the system is to create viable future scenarios that address both technical and social factors involved in the model domains. These scenarios enable policymakers to formulate a coherent, unified strategy towards building a safe and secure society. The paper gives an executive summary of our preliminary efforts in the past year and provides a glimpse of our work planned for the second year of a multi-year project being conducted at the Pacific Northwest National Laboratory.

Managing Complex Network Operation with Predictive Analytics

Huang, Z.; Wong, P.C.; Mackey, P.; Chen, Y.; Jian Ma, J.; Schneider, K.; Greitzer, F. L. 2009. Managing Complex Network Operation with Predictive Analytics. Association for the Advancement of Artificial Intelligence

Visual analytics for law enforcement: deploying a service-oriented analytic framework for web-based visualization

Dowson, Scott T.; Bruce, Joe; Best, Daniel M.; Riensche, Roderick M.; Franklin, Lyndsey; Pike, William A. 2009. "Visual analytics for law enforcement: deploying a service-oriented analytic framework for web-based visualization." Association for the Advancement of Artificial Intelligence Proc. SPIE, Vol. 7346, 734603 2009

Abstract

This paper presents key components of the Law Enforcement Information Framework (LEIF), an information system that provides communications, situational awareness, and visual analytics tools in a service-oriented architecture supporting web-based desktop and handheld device users. LEIF simplifies interfaces and visualizations of well-established visual analytic techniques to improve usability. Advanced analytics capability is maintained by enhancing the underlying processing to support the new interface. LEIF development is driven by real-world user feedback gathered through deployments at three operational law enforcement organizations in the U.S. The system incorporates a robust information ingest pipeline supporting a wide variety of information formats. LEIF also insulates interface and analytical components from information sources making it easier to adapt the framework for many different data repositories.

2008

A Dynamic Multiscale Magnifying Tool for Exploring Large Sparse Graphs

Wong PC, HP Foote, PS Mackey, G Chin, Jr, HJ Sofia, and JJ Thomas. 2008. "A Dynamic Multiscale Magnifying Tool for Exploring Large Sparse Graphs." Information Visualization 7:105-117.

Abstract

We present an information visualization tool, known as GreenMax, to visually explore large small-world graphs with up to a million graph nodes on a desktop computer. A major motivation for scanning a small-world graph in such a dynamic fashion is the demanding goal of identifying not just the well-known features but also the unknown–known and unknown–unknown features of the graph. GreenMax uses a highly effective multilevel graph drawing approach to pre-process a large graph by generating a hierarchy of increasingly coarse layouts that later support the dynamic zooming of the graph. This paper describes the graph visualization challenges, elaborates our solution, and evaluates the contributions of GreenMax in the larger context of visual analytics on large small-world graphs. We report the results of two case studies using GreenMax and the results support our claim that we can use GreenMax to locate unexpected features or structures behind a graph.

BioGraphE: High-performance bionetwork analysis using the Biological Graph Environment

Chin G, Jr, D Chavarría-Miranda, GC Nakamura, and HJ Sofia. 2008. "BioGraphE: High-performance bionetwork analysis using the Biological Graph Environment." BMC Bioinformatics.

Abstract

We introduce a computational framework for graph analysis called the Biological Graph Environment (BioGraphE), which provides a general, scalable integration platform for connecting graph problems in biology to optimized computational solvers and high-performance systems. This framework enables biology researchers and computational scientists to identify and deploy network analysis applications and to easily connect them to efficient and powerful computational software and hardware that are specifically designed and tuned to solve complex graph problems. In our particular application of BioGraphE to support network analysis in genome biology, we investigate the use of a Boolean satisfiability solver known as Survey Propagation as a core computational solver executing on standard high-performance parallel systems, as well as multithreaded architectures.

Bringing A Vector/Image Conflation Tool To The Commercial Market

Martucci LM, and B Kovalerchuk. 2008. "Bringing A Vector/Image Conflation Tool To The Commercial Market." In American Society of Photogrammetry and Remote Sensing (ASPRS) 2008 Annual Conference. American Society of Photogrammetry and Remote Sensing (ASPRS), Washington, DC.

Abstract

This paper addresses the conflation problem of integrating/aligning/fusing vector and image data in geospatial products, with special focus on the aspect of bringing a solution to the commercial market. Users of geospatial data in government, military, industry, research, and other sectors need accurate displays of information such as roads and other terrain information in areas of interest and operations. Our general approach to vector/raster conflation examines the problem in three activity areas: preprocessing, conflation processing, and postprocessing. We use two well-developed and complementary methodologies with the goal of integrating them into a unified framework for an optimized conflation solution. This research is conducted within an Army Small Business Innovation Research (SBIR) project with the critically important aspect of pursuing a technology transfer and commercialization strategy that would result in a likely pathway for transition into an operational capability. We describe fundamental principles and generalized roles of participants in the commercialization process. Further, we introduce the concept of putting technically sound products to beneficial use through the steps of (i) defining the specific use scenarios and the respective operational/business environment of that use, and (ii) performing product marketing in accordance with use scenarios and the stimulation of related environments. Several sample scenarios are presented, along with operating/business environments, to demonstrate the concept. The approach assesses the technological readiness of the user for a vector/raster product with a view towards application of a more penetrating market analysis that attempts to pinpoint the technology transition opportunities in a complex and ever expanding geospatial data arena.

Progress and Challenges in Evaluating Tools for Sensemaking

Scholtz JC. 2008. "Progress and Challenges in Evaluating Tools for Sensemaking." Presented at the ACM Computer-Human Interaction (CHI) conference Workshop on Sensemaking in Florence, Italy, April 6, 2008.

Abstract

In this paper we discuss current work and challenges for the development of metrics to evaluate software designed to help analysts with sensemaking activities. While much of the work we describe has been done in the context of intelligence analysis, we are also concerned with the general applicability of metrics and evaluation methodologies for other analytic domains.

2007

Fast Point-Feature Label Placement for Dynamic Visualizations

Mote KD. 2007. "Fast Point-Feature Label Placement for Dynamic Visualizations." Information Visualization 6(4):249-260.

Putting Security in Context: Visual Correlation of Network Activity with Real-World Information

Pike WA, SJ Zabriskie, and C Scherrer. 2007. "Putting Security in Context: Visual Correlation of Network Activity with Real-World Information." In Workshop on Visualization for Computer Security 2007 (VizSEC 07). PNNL-SA-57153, Pacific Northwest National Laboratory, Richland, WA.

Scalable Visual Analytics of Massive Textual Datasets

Krishnan M, SJ Bohn, WE Cowley, VL Crow, and J Nieplocha. 2007. "Scalable Visual Analytics of Massive Textual Datasets." In IEEE International Parallel & Distributed Processing Symposium. Long Beach, CA, March 26-30, 2007.

Abstract

This paper describes the first scalable implementation of a text processing engine used in visual analytics tools. These tools aid information analysts in interacting with and understanding large textual information content through visual interfaces. By developing a parallel implementation of the text processing engine, we enabled visual analytics tools to exploit cluster architectures and handle massive datasets. The paper describes key elements of our parallelization approach and demonstrates virtually linear scaling when processing multi-gigabyte data sets such as PubMed. This approach enables interactive analysis of large datasets beyond the capabilities of existing state-of-the-art visual analytics tools.

Visual Analysis of Weblog Content

Gregory ML, DA Payne, D McColgin, NO Cramer, and DV Love. 2007. "Visual Analysis of Weblog Content." In International Conference on Weblogs and Social Media '07. pp. 227-230. Boulder, March 26-28, 2007.

Abstract

In recent years, one of the advances of the World Wide Web is social media and one of the fastest growing aspects of social media is the blogosphere. Blogs make content creation easy and are highly accessible through web pages and syndication. With their growing influence, a need has arisen to be able to monitor the opinions and insight revealed within their content. This paper describes a technical approach for analyzing the content of blog data using a visual analytic tool, IN-SPIRE, developed by Pacific Northwest National Laboratory. We will describe both how an analyst can explore blog data with IN-SPIRE and how the tool could be modified in the future to handle the specific nuances of analyzing blog data.

Visual Analytics Science and Technology

Wong PC. 2007. "Visual Analytics Science and Technology." Information Visualization 2007(6):1-2.

2006

Diverse Information Integration and Visualization

Havre SL, A Shah, C Posse, and BM Webb-Robertson. 2006. "Diverse Information Integration and Visualization." In Visualization and Data Analysis 2006 (EI10). SPIE The International Society for Optical Engineering, San Jose, CA.

Abstract

This paper presents and explores a technique for visually integrating and exploring diverse information. Society produces, collects, and processes ever larger and diverse data including semi- and un-structured text, as well as transaction, communication, and scientific data. It is no longer sufficient to analyze one type of data or information in isolation. Users need to explore their data/information in the context of related information to discover often hidden, but meaningful, complex relationships. Our approach visualizes multiple, like entities across multiple dimensions where each dimension is a partitioning of the entities. The partitioning may be based on inherent or assigned attributes of the entities (or entity data) such as meta-data or prior knowledge captured in annotations. The partitioning may also be derived from entity data. For example, clustering, or unsupervised classification, can be applied to arrays of multidimensional entity data to partition the entities into groups of similar entities, or clusters. The same entities may be clustered on data from different experiment types or processing approaches. This reduction of diverse data/information on an entity to a series of partitions, or discrete (and unit-less) categories, allows the user to view the entities across a variety of data without concern for data types and units. Parallel coordinates visualize entity data across multiple dimensions of typically continuous attributes. We adapt parallel coordinates for dimensions with discrete attributes (partitions) to allow the comparison of entity partition patterns for identifying trends and outlier entities. We illustrate this approach through a prototype, Juxter (short for Juxtaposer).

From Question Answering to Visual Exploration

McColgin DW, ML Gregory, EG Hetzler, and AE Turner. 2006. "From Question Answering to Visual Exploration." In Proceedings of the ACM SIGIR workshop on Evaluating Exploratory Search Systems, pp. 47-50. Seattle, August 10, 2006.

Abstract

Research in Question Answering has focused on the quality of information retrieval or extraction using the metrics of precision and recall to judge success; these metrics drive toward finding the specific best answer(s) and best support a lookup type of search. They do not address the opportunity that users' natural language questions present for exploratory interactions. In this paper, we present an integrated Question Answering environment that combines a visual analytics tool for unstructured text and a state-of-the-art query expansion tool designed to complement the cognitive processes associated with an information analyst's workflow. Analysts are seldom looking for factoid answers to simple questions; their information needs are much more complex in that they may be interested in patterns of answers over time and in conflicting information, and even related non-answer data may be critical to learning about a problem or reaching prudent conclusions. In our visual analytics tool, questions result in a comprehensive answer space that allows users to explore the variety within the answers and spot related information in the rest of the data. The exploratory nature of the dialog between the user and this system requires tailored evaluation methods that better address the evolving user goals and counter cognitive biases inherent to exploratory search tasks.

Generating Graphs for Visual Analytics through Interactive Sketching

Wong PC, HP Foote, PS Mackey, KA Perrine, and G Chin, Jr. 2006. "Generating Graphs for Visual Analytics through Interactive Sketching." IEEE Transactions on Visualization and Computer Graphics 12(6). doi:10.1109/TVCG.2006.91

Abstract

We introduce an interactive graph generator, GreenSketch, designed to facilitate the creation of descriptive graphs required for different visual analytics tasks. The human-centric design approach of GreenSketch enables users to master the creation process without specific training or prior knowledge of graph model theory. The customized user interface encourages users to gain insight into the connection between the compact matrix representation and the topology of a graph layout when they sketch their graphs. Both the human-enforced and machine-generated randomnesses supported by GreenSketch provide the flexibility needed to address the uncertainty factor in many analytical tasks. This paper describes over two dozen examples that cover a wide variety of graph creations from a single line of nodes to a real-life small-world network that describes a snapshot of telephone connections. While the discussion focuses mainly on the design of GreenSketch, we include a case study that applies the technology in a visual analytics environment and a usability study that evaluates the strengths and weaknesses of our design approach.

Graph Signatures for Visual Analytics

Wong PC, HP Foote, G Chin, Jr, PS Mackey, and KA Perrine. 2006. "Graph Signatures for Visual Analytics." IEEE Transactions on Visualization and Computer Graphics 12(6). doi:10.1109/TVCG.2006.92

Abstract

We present a visual analytics technique to explore graphs using the concept of a data signature. A data signature, in our context, is a multidimensional vector that captures the local topology information surrounding each graph node. Signature vectors extracted from a graph are projected onto a low-dimensional scatterplot through the use of scaling. The resultant scatterplot, which reflects the similarities of the vectors, allows analysts to examine the graph structures and their corresponding real-life interpretations through repeated use of brushing and linking between the two visualizations. The interpretation of the graph structures is based on the outcomes of multiple participatory analysis sessions with intelligence analysts conducted by the authors at the Pacific Northwest National Laboratory. The paper first uses three public domain datasets with either well-known or obvious features to explain the rationale of our design and illustrate its results. More advanced examples are then used in a customized usability study to evaluate the effectiveness and efficiency of our approach. The study results reveal not only the limitations and weaknesses of the traditional approach based solely on graph visualization but also the advantages and strengths of our signature-guided approach presented in the paper.
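As a rough illustration of a per-node vector that captures local topology, the sketch below counts how many nodes lie exactly 1, 2, and 3 hops from each node. The abstract does not define the signature's components, so this hop-count vector, the function name `node_signature`, and the `max_hops` parameter are assumptions for illustration only. The projection of the vectors onto a scatterplot is not shown.

```python
from collections import deque


def node_signature(adjacency, start, max_hops=3):
    """Count the nodes that sit exactly 1, 2, ..., max_hops hops from `start`."""
    distance = {start: 0}
    queue = deque([start])
    counts = [0] * max_hops
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the neighborhood radius
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                counts[distance[neighbor] - 1] += 1
                queue.append(neighbor)
    return counts


# Hypothetical undirected graph given as an adjacency list.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b", "e"],
    "e": ["d"],
}
signatures = {node: node_signature(graph, node) for node in graph}
for node, vector in signatures.items():
    print(node, vector)
```

Nodes with similar vectors occupy structurally similar positions, so they would land near each other once the vectors are projected to two dimensions.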

Have Green - A Visual Analytics Framework for Large Semantic Graphs

Wong PC, G Chin, Jr, HP Foote, PS Mackey, and JJ Thomas. 2006. "Have Green - A Visual Analytics Framework for Large Semantic Graphs." In IEEE Symposium on Visual Analytics Science and Technology, pp 67-74. Baltimore, Maryland, October 31-November 2, 2006.

Abstract

A semantic graph is a network of heterogeneous nodes and links annotated with a domain ontology. In intelligence analysis, investigators use semantic graphs to organize concepts and relationships as graph nodes and links in hopes of discovering key trends, patterns, and insights. However, as new information continues to arrive from a multitude of sources, the size and complexity of the semantic graphs will soon overwhelm an investigator's cognitive capacity to carry out significant analyses. We introduce a powerful visual analytics framework designed to enhance investigators' natural analytical capabilities to comprehend and analyze large semantic graphs. The paper describes the overall framework design, presents major development accomplishments to date, and discusses future directions of a new visual analytics system known as Have Green.

Walking the Path-A New Journey to Explore and Discover through Visual Analytics

Wong PC, SJ Rose, G Chin, Jr, D Frincke, RA May, II, C Posse, AP Sanfilippo, and JJ Thomas. 2006. "Walking the Path-A New Journey to Explore and Discover through Visual Analytics." Information Visualization 5(4):237-249. doi:10.1057/palgrave.ivs.9500133

Abstract

Visual representations are essential aids to human cognitive tasks and are valued to the extent that they provide stable and external reference points upon which dynamic activities and thought processes may be calibrated and upon which models and theories can be tested and confirmed. The active use and manipulation of visual representations makes many complex and intensive cognitive tasks feasible. As described in the recently published "Illuminating the Path", visual analytics is "the science of analytical reasoning facilitated by interactive visual interfaces." We describe research and development at PNNL focused on improving the value that interactive visual representations provide to persons engaged in complex cognitive tasks. We describe work at PNNL that carries forward research from multiple disciplines with a goal to improve the capability of visual representations and present examples whose aim is to improve the extraction, and reasoning about information, knowledge, and data.

2005

A Typology for Visualizing Uncertainty

Thomson JR, EG Hetzler, A MacEachren, MN Gahegan, and M Pavel. 2005. "A Typology for Visualizing Uncertainty." In Visualization and Data Analysis 2005, Published in Proceedings of the SPIE, vol. 5669, pp. 146-157. SPIE, IS&T, San Jose, CA.

Abstract

Information analysts must rapidly assess information to determine its usefulness in supporting and informing decision makers. In addition to assessing the content, the analyst must also be confident about the quality and veracity of the information. Visualizations can concisely represent vast quantities of information thus aiding the analyst to examine larger quantities of material; however visualization programs are challenged to incorporate a notion of confidence or certainty because the factors that influence the certainty or uncertainty of information vary with the type of information and the type of decisions being made. For example, the assessment of potentially subjective human-reported data leads to a large set of uncertainty concerns in fields such as national security, law enforcement (witness reports), and even scientific analysis where data is collected from a variety of individual observers. What's needed is a formal model or framework for describing uncertainty as it relates to information analysis, to provide a consistent basis for constructing visualizations of uncertainty. This paper proposes an expanded typology for uncertainty, drawing from past frameworks targeted at scientific computing. The typology provides general categories for analytic uncertainty, a framework for creating task-specific refinements to those categories, and examples drawn from the national security field.

Bioinformatic Insights from Metagenomics through Visualization

Havre SL, BM Webb-Robertson, A Shah, C Posse, B Gopalan, and FJ Brockman. 2005. "Bioinformatic Insights from Metagenomics through Visualization." In Proceedings of the IEEE Computational Systems Bioinformatics Conference (CSB 2005). August 8-11, 2005, pp. 341-350. IEEE Computer Society, Los Alamitos, CA.

Abstract

Cutting-edge biological and bioinformatics research seeks a systems perspective through the analysis of multiple types of high-throughput and other experimental data for the same sample. Systems-level analysis requires the integration and fusion of such data, typically through advanced statistics and mathematics. Visualization is a complementary computational approach that supports integration and analysis of complex data or its derivatives. We present a bioinformatics visualization prototype, Juxter, which depicts categorical information derived from or assigned to these diverse data for the purpose of comparing patterns across categorizations. The visualization allows users to easily discern correlated and anomalous patterns in the data. These patterns, which might not be detected automatically by algorithms, may reveal valuable information leading to insight and discovery. We describe the visualization and interaction capabilities and demonstrate its utility in a new field, metagenomics, which combines molecular biology and genetics to identify and characterize genetic material from multi-species microbial samples.

Building a Human Information Discourse Interface to Uncover Scenario Content

Sanfilippo AP, BL Baddeley, AJ Cowell, ML Gregory, RE Hohimer, and SC Tratz. 2005. "Building a Human Information Discourse Interface to Uncover Scenario Content." In 2005 International Conference on Intelligence Analysis. Mitre Website, McLean, VA.

Dynamic Visualization of Graphs with Extended Labels

Wong PC, PS Mackey, KA Perrine, JR Eagan, HP Foote, and J Thomas. 2005. "Dynamic Visualization of Graphs with Extended Labels." In 2005 IEEE Symposium on Information Visualization, Los Alamitos, CA, October 2005, pp. 73-80. IEEE, Piscataway, NJ.

Abstract

The paper describes a novel technique to visualize graphs with extended node and link labels. The lengths of these labels range from a short phrase to a full sentence to an entire paragraph and beyond. Our solution is different from all the existing approaches that almost always rely on intensive computational effort to optimize the label placement problem. Instead, we share the visualization resources with the graph and present the label information in static, interactive, and dynamic modes without the requirement for tackling the intractability issues. This allows us to reallocate the computational resources for dynamic presentation of real-time information. The paper includes a user study to evaluate the effectiveness and efficiency of the visualization technique.

Extending the Reach of Augmented Cognition To Real-World Decision Making Tasks

Greitzer FL. 2005. "Extending the Reach of Augmented Cognition To Real-World Decision Making Tasks." In Augmented Cognition International Conference. HCI-International, Las Vegas.

Abstract

The focus of this paper is on the critical challenge of bridging the gap between psychophysiological sensor data and the inferred cognitive states of users. It is argued that a more robust behavioral data collection foundation will facilitate accurate inferences about the state of the user so that an appropriate mitigation strategy, if needed, can be applied. The argument for such a foundation is based on two premises: (1) To realize the envisioned impact of augmented cognition systems, the technology should be applied to a broad, and more cognitively complex, range of real-world problems. (2) To support identifying cognitive states for more complex, real-world tasks, more sophisticated instrumentation will be needed for behavioral data collection. It is argued that such instrumentation would enable inferences to be made about higher-level semantic aspects of performance. The paper describes how instrumentation software developed to support information analysis R&D may serve as an integration environment that can provide additional behavioral data, in context, to facilitate inferences of cognitive state that will enable the successful augmenting of cognitive performance.

InfoStar: An Adaptive Visual Analytics Platform for Mobile Devices

Sanfilippo AP, RA May, II, GR Danielson, RM Riensche, and BL Baddeley. 2005. "InfoStar: An Adaptive Visual Analytics Platform for Mobile Devices." In First International Workshop on Managing Context Information in Mobile and Pervasive Environments. CEUR-WS.org, Ayia Napa, Cyprus.

Abstract

We present the design and implementation of InfoStar, an adaptive Visual Analytics platform for mobile devices such as PDAs, laptops, Tablet PCs and mobile phones. InfoStar extends the reach of visual analytics technology beyond the traditional desktop paradigm to provide ubiquitous access to interactive visualizations of information spaces. These visualizations are critical in addressing the knowledge needs of human agents operating in the field, in areas as diverse as business, homeland security, law enforcement, protective services, emergency medical services and scientific discovery. We describe an initial real world deployment of this technology, in which the InfoStar platform has been used to offer mobile access to scheduling and venue information to conference attendees at Supercomputing 2004.

Metrics and Measures for Intelligence Analysis Task Difficulty

Greitzer FL, and KM Allwein. 2005. "Metrics and Measures for Intelligence Analysis Task Difficulty." In First International Conference on Intelligence Analysis Methods and Tools. MITRE Corp, McLean, VA.

Abstract

Recent workshops and conferences supporting the intelligence community (IC) have highlighted the need to characterize the difficulty or complexity of intelligence analysis (IA) tasks in order to facilitate assessments of the impact or effectiveness of IA tools that are being considered for introduction into the IC. Some fundamental issues are: (a) how to employ rigorous methodologies in evaluating tools, given a host of problems such as controlling for task difficulty, effects of time or learning, small-sample size limitations; (b) how to measure the difficulty/complexity of IA tasks in order to establish valid experimental/quasi-experimental designs aimed to support evaluation of tools; and (c) development of more rigorous (summative), performance-based measures of human performance during the conduct of IA tasks, beyond the more traditional reliance on formative assessments (e.g., subjective ratings). Invited discussants will be asked to comment on one or more of these issues, with the aim of bringing the most salient issues and research needs into focus.

New Challenges Facing Integrative Biological Science in the Post-Genomic Era

Oehmen CS, T Straatsma, GA Anderson, G Orr, BM Webb-Robertson, RC Taylor, RW Mooney, DJ Baxter, DR Jones, and DA Dixon. 2005. "New Challenges Facing Integrative Biological Science in the Post-Genomic Era." Journal of Biological Systems.

Abstract

The future of biology will be increasingly driven by the fundamental paradigm shift from hypothesis-driven research to data-driven discovery research employing the massive amounts of available biological data. We identify key technological developments needed to enable this paradigm shift involving (1) the ability to store and manage extremely large datasets which are dispersed over a wide geographical area, (2) development of novel analysis and visualization tools which are capable of operating on enormous data resources without overwhelming researchers with unusable information, and (3) formalisms for integrating mathematical models of biosystems from the molecular level to the organism population level. This will require the development of tools which efficiently utilize high-performance compute power, large storage infrastructures and large aggregate memory architectures. The end result will be the ability of a researcher to integrate complex data from many different sources with simulations to analyze a given system at a wide range of temporal and spatial scales in a single conceptual model.

Turning the Bucket of Text into a Pipe

Hetzler EG, VL Crow, DA Payne, and AE Turner. 2005. "Turning the Bucket of Text into a Pipe." In Proceedings of the IEEE Symposium on Information Visualization. INFOVIS 2005. 23-25 Oct. 2005, pp. 89-94. IEEE, Los Alamitos, CA.

Abstract

Many visual analysis tools operate on a fixed set of data. However, professional information analysts follow issues over a period of time, and need to be able to easily add the new documents to an ongoing exploration. Some analysts handle documents in a moving window of time, with new documents constantly added and old ones aging out. This paper describes both the user interaction and the technical implementation approach for a visual analysis system designed to support constantly evolving text collections.

Scientist-Centered Graph-Based Models of Scientific Knowledge

Chin G Jr, EG Stephan, DK Gracio, OA Kuchar, PD Whitney, and KL Schuchardt. 2005. "Scientist-Centered Graph-Based Models of Scientific Knowledge." In HCI International 2005. 11th International Conference on Human-Computer Interaction, 22-27 July 2005, Caesars Palace, Las Vegas, Nevada, USA, 10 pages. Lawrence Erlbaum and Associates, Mahwah, NJ.

Abstract

At the Pacific Northwest National Laboratory, we are researching and developing visual models and paradigms that will allow scientists to capture and represent conceptual models in a computational form that may be linked to and integrated with scientific data sets and applications. Captured conceptual models may be logical in conveying how individual concepts tie together to form a higher theory, analytical in conveying intermediate or final analysis results, or temporal in describing the experimental process in which concepts are physically and computationally explored. In this paper, we describe and contrast three different research and development systems that allow scientists to capture and interact with computational graph-based models of scientific knowledge. Through these examples, we explore and examine ways in which researchers may graphically encode and apply scientific theory and practice on computer systems.

Top Ten Needs for Intelligence Analysis Tool Development

Badalamente RV, and FL Greitzer. 2005. "Top Ten Needs for Intelligence Analysis Tool Development." In First International Conference on Intelligence Analysis Methods and Tools. MITRE Corp, McLean, VA.

Abstract

The purpose of this paper is to report on the results of R&D to generate ideas about future enhancements to software systems designed to aid the process of intelligence analysis (IA). Use of IA tools in actual settings has revealed significant problems: the user's thought process has not been adequately modeled and is therefore not reflected in the design of analysis tools; users find the tools difficult to learn and use; the tools are not tailored to specific intelligence domains; the tools do not offer an integrated approach (data preprocessing/ingest is a particular problem); the tools do not address the longitudinal nature (continuing over extended periods of time) of the general analysis problem. The aim of this work was to establish an enduring, well-integrated, robust technical foundation for the development and deployment of information-technology (IT)-based IA tools recognized by users and clients as uniquely well designed to meet their varied analysis needs. An overarching strategy or "roadmap" is needed to guide technology development, and a more accurate understanding is needed about how real intelligence analysts do their job. To address these needs, we conducted a facilitated workshop with nine working analysts. An intelligence analysis process model was developed and discussed with the analysts as a point of departure for the discussion. Participants worked in break-out groups to discuss concepts for tools and enhanced products to aid in the IA process. The top ten enhancements identified during the workshop were: seamless data access and ingest; diverse data ingest and fusion; shared electronic folders for collaborative analysis; hypothesis generation and tracking; template for analysis strategy; electronic skills inventory; dynamic data processing and visualization; intelligent tutor for intelligence product development; imagery data resources; intelligence analysis knowledge base. 
This paper and presentation will discuss the conduct of the workshop and the results obtained.

Toward the Development of Cognitive Task Difficulty Metrics to Support Intelligence Analysis Research

Greitzer FL. 2005. "Toward the Development of Cognitive Task Difficulty Metrics to Support Intelligence Analysis Research." In The Fourth IEEE Conference on Cognitive Informatics, Aug. 8-10, 2005. ICCI 2005, pp. 315-320. Institute of Electrical and Electronics Engineers, Piscataway, NJ.

Abstract

Intelligence analysis is a cognitively complex task that is the subject of considerable research aimed at developing methods and tools to aid the analysis process. To support such research, it is necessary to characterize the difficulty or complexity of intelligence analysis tasks in order to facilitate assessments of the impact or effectiveness of tools that are being considered for deployment. A number of informal accounts of "What makes intelligence analysis hard" are available, but there has been no attempt to establish a more rigorous characterization with well-defined difficulty factors or dimensions. This paper takes an initial step in this direction by describing a set of proposed difficulty metrics based on cognitive principles.

Visual Sample Plan (VSP) Software: Designs and Data Analyses for Sampling Contaminated Buildings

Pulsipher BA, JE Wilson, RO Gilbert, LL Nuffer, and NL Hassig. 2005. "Visual Sample Plan (VSP) Software: Designs and Data Analyses for Sampling Contaminated Buildings." In Proceedings of 24th Annual National Conference on Managing Environmental Quality Systems, vol. 24-2-2, pp. 24-34. US EPA, Washington, DC.

Abstract

A new module of the Visual Sample Plan (VSP) software has been developed to provide sampling designs and data analyses for potentially contaminated buildings. An important application is assessing levels of contamination in buildings after a terrorist attack. This new module, funded by DHS through the Combating Terrorism Technology Support Office, Technical Support Working Group, was developed to provide a tailored, user-friendly and visually oriented buildings module within the existing VSP software toolkit, the latest version of which can be downloaded from http://dqo.pnl.gov/vsp. In case of, or when planning against, a chemical, biological, or radionuclide release within a building, the VSP module can be used to quickly and easily develop and visualize technically defensible sampling schemes for walls, floors, ceilings, and other surfaces to statistically determine if contamination is present, its magnitude and extent throughout the building, and if decontamination has been effective. This paper demonstrates the features of this new VSP buildings module, which include: the ability to import building floor plans or to easily draw, manipulate, and view rooms in several ways; being able to insert doors, windows and annotations into a room; 3-D graphic room views with surfaces labeled; and floor plans that show building zones that have separate air handling units. The paper will also discuss the statistical design and data analysis options available in the buildings module. Design objectives supported include comparing an average to a threshold when the data distribution is normal or unknown, and comparing measurements to a threshold to detect hotspots or to ensure most of the area is uncontaminated when the data distribution is normal or unknown.

2004

Analysis Experiences Using Information Visualization

Hetzler, E. and Turner A. 2004. "Analysis experiences using information visualization." IEEE Computer Graphics and Applications, 24:5, pp. 22-26.

Abstract

To deliver truly useful tools, researchers must learn how to map between the knowledge domains inherent in information collections and the knowledge domains in users' minds. The true measure of this work is not what the software shows, but what the user is able to understand by using it. This article summarizes lessons learned from an observational study of the application of the In-Spire visually-oriented text exploitation system in an operational analysis environment.

Supporting Mutual Understanding in a Visual Dialogue Between Analyst and Computer

Chappell AR, AJ Cowell, DA Thurman, and JR Thomson. 2004. "Supporting Mutual Understanding in a Visual Dialogue Between Analyst and Computer." In HFES 2004 proceedings of the Human Factors and Ergonomics Society 48th Annual Meeting: September 20-24, 2004, New Orleans, Louisiana, p. 5. Human Factors & Ergonomics Society, Santa Monica, CA.

Abstract

The Knowledge Associates for Novel Intelligence (KANI) project is developing a system of automated "associates" to actively support and participate in the information analysis task. The primary goal of KANI is to use automatically extracted information in a reasoning system that draws on the strengths of both a human analyst and automated reasoning. The interface between the two agents is a key element in achieving this goal. The KANI interface seeks to support a visual dialogue with mixed-initiative manipulation of information and reasoning components. To be successful, the interface must achieve mutual understanding between the analyst and KANI of the other's actions. Toward this mutual understanding, KANI allows the analyst to work at multiple levels of abstraction over the reasoning process, links the information presented across these levels to make use of interaction context, and provides querying facilities to allow exploration and explanation.

Visual Analytics

Wong PC, and J Thomas. 2004. "Visual Analytics." IEEE Computer Graphics and Applications, 24:5, pp. 20-21.

Excerpt:

The information revolution is upon us, and it's guaranteed to change our lives and the way we conduct our daily business. The fact that we have to deal with not just the size but also the variety and complexity of this information makes it a real challenge to survive the revolution. Enter visual analytics, a contemporary and proven approach to combine the art of human intuition and the science of mathematical deduction to directly perceive patterns and derive knowledge and insight from them.

Visual analytics is the formation of abstract visual metaphors in combination with a human information discourse (interaction) that enables detection of the expected and discovery of the unexpected within massive, dynamically changing information spaces. These suites of technologies apply to almost all fields but are being driven by critical needs in biology and national security...

Visualizing Data Streams

Wong PC, HP Foote, DR Adams, WE Cowley, LR Leung, and JJ Thomas. 2004. "Visualizing Data Streams." Chapter 11 in Visual and Spatial Analysis: Advances in Data Mining, Reasoning, and Problem Solving, ed. Boris Kovalerchuk and James Schwing, pp. 265-291, 568-571. Springer, Dordrecht, Netherlands.

Abstract

We introduce two dynamic visualization techniques using multi-dimensional scaling to analyze transient data streams such as newswires and remote sensing imagery. While the time-sensitive nature of these data streams requires immediate attention in many applications, the unpredictable and unbounded characteristics of this information can potentially overwhelm many scaling algorithms that require a full re-computation for every update. We present an adaptive visualization technique based on data stratification to ingest stream information adaptively when influx rate exceeds processing rate. We also describe an incremental visualization technique based on data fusion to project new information directly onto a visualization subspace spanned by the singular vectors of the previously processed neighboring data. The ultimate goal is to leverage the value of legacy and new information and minimize re-processing of the entire dataset in full resolution. We demonstrate these dynamic visualization results using a newswire corpus and a remote sensing imagery sequence.

2003

Dynamic Visualization of Transient Data Streams

Wong PC, HP Foote, DR Adams, WE Cowley, and JJ Thomas. 2003. "Dynamic Visualization of Transient Data Streams." In IEEE Symposium on Information Visualization 2003. Proceedings IEEE Symposium Information Visualization, Seattle, WA.

Abstract

We introduce two dynamic visualization techniques using multi-dimensional scaling to analyze transient data streams such as newswires and remote sensing imagery. While the time-sensitive nature of these data streams requires immediate attention in many applications, the unpredictable and unbounded characteristics of this information can potentially overwhelm many scaling algorithms that require a full re-computation for every update. We present an adaptive visualization technique based on data stratification to ingest stream information adaptively when influx rate exceeds processing rate. We also describe an incremental visualization technique based on data fusion to project new information directly onto a visualization subspace spanned by the singular vectors of the previously processed neighboring data. The ultimate goal is to leverage the value of legacy and new information and minimize re-processing of the entire dataset in full resolution. We demonstrate these dynamic visualization results using a newswire corpus and a remote sensing imagery sequence.

Global Visualization and Alignments of Whole Bacterial Genomes

Wong PC, K Wong, HP Foote, and JJ Thomas. 2003. "Global Visualization and Alignments of Whole Bacterial Genomes." IEEE Transactions on Visualization and Computer Graphics 9(3):361-377.

Abstract

We present a novel visualization technique to align whole bacterial genomes with millions of nucleotides. Our basic design combines the descriptive power of pixel-based visualizations with the interpretative strength of digital image-processing filters. The innovative use of pixel enhancement techniques on pixel-based visualizations brings out the best of the recursive data patterns and further enhances the effectiveness of the visualization techniques. The result is a fast, versatile, and cost-effective analysis tool to reveal the functional identifications and the phenotypic changes of whole bacterial genomes. Our experiments show that our visualization-based genome alignment technique outperforms other computational-based tools in processing time. They also show that our pictorial results are far superior to the hardcopy printouts generated by computation-based programs in studying the overall genomic structures. Six different bacterial genomes obtained from public genome banks are used to demonstrate our designs and measure their performances.

2002

Multivariate Visualization with Data Fusion

Wong PC, HP Foote, DL Kao, LR Leung, and JJ Thomas. 2002. "Multivariate Visualization with Data Fusion." In Information Visualization, vol. 1, no. 3/4, ed. Chaomei Chen, pp. 182-193. MacMillan, Hampshire, United Kingdom.

Abstract

We discuss a fusion-based visualization method to analyze a 2D flow field together with its related scalars. The primary difference between a conventional visualization and a fusion-based visualization is that the former draws on a single image whereas the latter draws on multiple see-through layers, which are then overlaid on each other to form the final visualization. We propose uniquely designed colormaps to highlight flow features that would not be shown with conventional colormaps. We present fusion techniques that integrate multiple single-purpose flow visualization techniques into the same viewing space. Our highly flexible fusion approach allows scientists to explore multiple parameters concurrently by mixing and matching images without frequently reconstructing new visualizations from its data for every possible combination. Sample datasets collected from a climate modeling study are used to demonstrate our approach.

ThemeRiver: Visualizing Thematic Changes in Large Document Collections

Havre S, E Hetzler, P Whitney, and L Nowell. "ThemeRiver: Visualizing Thematic Changes in Large Document Collections". IEEE Transactions on Visualization and Computer Graphics, Vol.8, No. 1, January-March 2002.

Abstract

The ThemeRiver visualization depicts thematic variations over time within a large collection of documents. The thematic changes are shown in the context of a timeline and corresponding external events. The focus on temporal thematic change within a context framework allows a user to discern patterns that suggest relationships or trends. For example, the sudden change of thematic strength following an external event may indicate a causal relationship. Such patterns are not readily accessible in other visualizations of the data. We use a river metaphor to convey several key notions. The document collection's time line, selected thematic content, and thematic strength are indicated by the river's directed flow, composition, and changing width, respectively. The directed flow from left to right is interpreted as movement through time and the horizontal distance between two points on the river defines a time interval. At any point in time, the vertical distance, or width, of the river indicates the collective strength of the selected themes. Colored "currents" flowing within the river represent individual themes. A current's vertical width narrows or broadens to indicate decreases or increases in the strength of the individual theme.
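
The river metaphor described in this abstract can be illustrated with a small sketch. The Python below is not the ThemeRiver implementation; the function name `stack_currents`, the input layout (one list of strengths per theme), and the choice to center the river on a horizontal axis at y = 0 are all assumptions made for illustration.

```python
def stack_currents(strengths):
    """Return per-time-step (lower, upper) bounds for each theme's current.

    strengths: list of themes, each a list of non-negative strengths,
    one value per time step (hypothetical input layout).
    The river is centered on y = 0 (an assumption of this sketch), so its
    total width at a time step equals the sum of all theme strengths there.
    """
    n_steps = len(strengths[0])
    bands = [[] for _ in strengths]
    for t in range(n_steps):
        total = sum(theme[t] for theme in strengths)  # river width at time t
        lower = -total / 2.0                          # bottom bank of the river
        for i, theme in enumerate(strengths):
            upper = lower + theme[t]                  # current width = theme strength
            bands[i].append((lower, upper))
            lower = upper                             # next current stacks on top
    return bands


themes = [[1.0, 2.0, 4.0],   # theme A strengthens over time
          [3.0, 2.0, 0.0]]   # theme B fades out
bands = stack_currents(themes)
```

Each entry of `bands` gives the vertical extent of one current at each time step; plotting those extents left to right as filled regions would give a rough river-like picture.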

2001

Change Blindness in Information Visualization: A Case Study

Nowell LT, EG Hetzler, and TE Tanasse. 2001. "Change Blindness in Information Visualization." October 22-23, 2001 Proceedings of the IEEE Information Visualization Symposium 2001 (InfoVis 2001), San Diego, CA.

Interactive Visualization of Multiple Query Results

S. Havre, E. Hetzler, K. Perrine, E. Jurrus, and N. Miller. 2001. "Interactive Visualization of Multiple Query Results." October 22-23, 2001 Proceedings of the IEEE Information Visualization Symposium 2001 (InfoVis 2001), San Diego, CA.

Abstract

This paper introduces a graphical method for visually presenting and exploring the results of multiple queries simultaneously. This method allows a user to visually compare multiple query result sets, explore various combinations among the query result sets, and identify the “best” matches for combinations of multiple independent queries. This approach might also help users explore methods for progressively improving queries by visually comparing the improvement in result sets.

Radical SAM, A Novel Protein Superfamily Linking Unresolved Steps in Familiar Biosynthetic Pathways with Radical Mechanisms: Functional Characterization Using New Analysis and Information Visualization Methods

Sofia HJ, G Chen, EG Hetzler, JF Reyes Spindola, and NE Miller. 2001. "Radical SAM, A Novel Protein Superfamily Linking Unresolved Steps in Familiar Biosynthetic Pathways with Radical Mechanisms: Functional Characterization Using New Analysis and Information Visualization Methods." Nucleic Acids Research 29(5):1097-1106.

Abstract

A large protein superfamily with over 500 members has been discovered and analyzed using powerful new bioinformatics and information visualization methods. Evidence exists that these proteins generate a 5′-deoxyadenosyl radical by reductive cleavage of S-adenosylmethionine (SAM) through an unusual Fe-S center. Radical SAM superfamily proteins function in DNA precursor, vitamin, cofactor, antibiotic, and herbicide biosynthesis in a collection of basic and familiar pathways. One of the members is interferon-inducible and is considered a candidate drug target for osteoporosis. The identification of this superfamily suggests that radical-based catalysis is important in a number of previously well-studied but unresolved biochemical pathways.

2000

Data Signatures and Visualization of Very Large Datasets

Wong PC, H Foote, R Leung, D Adams, and J Thomas. 2000. Data Signatures and Visualization of Very Large Datasets. IEEE Computer Graphics and Applications, Vol 20, No 2, March 2000.

Abstract

Today, as data sets used in computations grow in size and complexity, the technologies developed over the years to deal with scientific data sets have become less efficient and effective. Many frequently used operations, such as Eigenvector computation, could quickly exhaust our desktop workstations once the data size reaches certain limits.

On the other hand, the high-dimensional data sets we collect every day don't relieve the problem. Many conventional metric designs that build on quantitative or categorical data sets cannot be applied directly to heterogeneous data sets with multiple data types. While building new machines with more resources might conquer the data size problems, the complexity of today's computations requires a new breed of projection techniques to support analysis of the data and verification of the results.

We introduce the concept of a data signature, which captures the essence of a scientific data set in a compact format, and use it to conduct analysis as if using the original. A time-dependent climate simulation data set demonstrates our approach and presents the results.

DriftWeed - A Visual Metaphor for Interactive Analysis of Multivariate Data

Rose S and PC Wong. 2000. DriftWeed - A Visual Metaphor for Interactive Analysis of Multivariate Data. Proceedings IS&T/SPIE Conference on Visual Data Exploration and Analysis, San Jose, CA, Jan 2000.

Abstract

We present a visualization technique that allows a user to identify and detect patterns and structures within a multivariate data set. Our research builds on previous efforts to represent multivariate data in a two-dimensional information display through the use of icon plots. Although the icon plot work done by Pickett and Grinstein is similar to our approach, we improve on their efforts in several ways.

Our technique allows analysis of a time series without using animation; promotes visual differentiation of information clusters based on measures of variance; and facilitates exploration through direct manipulation of geometry based on scales of variance.

Our goal is to provide a visualization that implicitly conveys the degree to which an element's ordered collection (pattern) of attributes varies from the prevailing pattern of attributes for other elements in the collection. We apply this technique to multivariate abstract data and use it to locate exceptional elements in a data set and divisions among clusters.

ThemeRiver: Visualizing Theme Changes over Time

Havre S, B Hetzler, and L Nowell. 2000. "ThemeRiver: Visualizing Theme Changes over Time", Proceedings of IEEE Symposium on Information Visualization, InfoVis 2000, pp. 115 - 123.

Abstract

ThemeRiver™ is a prototype system that visualizes thematic variations over time within a large collection of documents. The "river" flows from left to right through time, changing width to depict changes in thematic strength of temporally associated documents. Colored "currents" flowing within the river narrow or widen to indicate decreases or increases in the strength of an individual topic or a group of topics in the associated documents. The river is shown within the context of a timeline and a corresponding textual presentation of external events.

Vector Fields Simplification - A Case Study of Visualizing Climate Modeling and Simulation Data Sets

Wong PC, H Foote, R Leung, E Jurrus, D Adams, and J Thomas. 2000. Vector Fields Simplification - A Case Study of Visualizing Climate Modeling and Simulation Data Sets. Proceedings IEEE Visualization 2000. Salt Lake City, Utah, Oct 8 - Oct 13, 2000.

Abstract

In our study of regional climate modeling and simulation, we frequently encounter vector fields that are crowded with large numbers of critical points. A critical point in a flow is where the vector field vanishes. While these critical points accurately reflect the topology of the vector fields, in our study only a subset of them is worth further investigation. We present a filtering technique based on the vorticity of the vector fields to eliminate the less interesting and sometimes sporadic critical points in a multi-resolution fashion. The neighboring regions of the preserved features, which are characterized by strong shear and circulation, are potential locations of weather instability. We apply our feature-filtering technique to a regional climate modeling data set covering East Asia in the summer of 1991.
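
The two notions used in this abstract, a critical point (where the vector field vanishes) and vorticity, can be sketched on a small grid. The Python below is a simplified illustration only and does not reproduce the paper's multi-resolution filter; the function names, the central-difference estimate, the grid layout (row index as y, column index as x), and the single `min_vorticity` threshold are assumptions.

```python
def vorticity(u, v, i, j, h=1.0):
    """Central-difference estimate of dv/dx - du/dy at interior point (i, j).

    u[i][j] and v[i][j] hold the x- and y-components of the field at
    row i (y direction) and column j (x direction); h is the grid spacing.
    """
    dv_dx = (v[i][j + 1] - v[i][j - 1]) / (2.0 * h)
    du_dy = (u[i + 1][j] - u[i - 1][j]) / (2.0 * h)
    return dv_dx - du_dy


def filter_critical_points(u, v, min_vorticity, eps=1e-9):
    """Keep interior grid points where the field vanishes and |vorticity| is large."""
    kept = []
    for i in range(1, len(u) - 1):
        for j in range(1, len(u[0]) - 1):
            vanishes = abs(u[i][j]) <= eps and abs(v[i][j]) <= eps
            if vanishes and abs(vorticity(u, v, i, j)) >= min_vorticity:
                kept.append((i, j))
    return kept


# Rigid rotation about the grid center on a 3x3 grid: u = -(y - 1), v = x - 1.
u = [[-(y - 1.0) for x in range(3)] for y in range(3)]
v = [[(x - 1.0) for x in range(3)] for y in range(3)]
points = filter_critical_points(u, v, min_vorticity=1.0)
```

In this toy field the only critical point is the center of rotation, and it survives the filter because the rotation gives it a nonzero vorticity; raising `min_vorticity` above that value would discard it.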

Visualizing Sequential Patterns for Text Mining

Wong PC, W Cowley, H Foote, E Jurrus, and J Thomas. 2000. Visualizing Sequential Patterns for Text Mining. Proceedings IEEE Information Visualization 2000, Salt Lake City, Utah, Oct 8 - Oct 13, 2000.

Abstract

A sequential pattern in data mining is a finite series of elements such as A→B→C→D where A, B, C, and D are elements of the same domain. The mining of sequential patterns is designed to find patterns of discrete events that frequently happen in the same arrangement along a timeline. Like association and clustering, the mining of sequential patterns is among the most popular knowledge discovery techniques that apply statistical measures to extract useful information from large datasets. As our computers become more powerful, we are able to mine bigger datasets and obtain hundreds of thousands of sequential patterns in full detail. With this vast amount of data, we argue that neither data mining nor visualization by itself can manage the information and reflect the knowledge effectively. Subsequently, we apply visualization to augment data mining in a study of sequential patterns in large text corpora. The result shows that we can learn more and more quickly in an integrated visual data-mining environment.
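
As a rough illustration of what it means for a pattern such as A→B→C→D to occur "in the same arrangement along a timeline", the Python sketch below checks whether a pattern appears in order within an event sequence and counts how often that happens across a collection. The function names are hypothetical, and allowing other events between pattern elements is an assumption of this sketch, not something the abstract specifies.

```python
def contains_in_order(events, pattern):
    """True if the pattern's elements occur in events in the same order.

    Other events may appear between pattern elements (an assumption here).
    """
    remaining = iter(events)
    # Each membership test consumes the iterator up to the matched element,
    # so later pattern elements must be found after earlier ones.
    return all(item in remaining for item in pattern)


def pattern_support(sequences, pattern):
    """Fraction of sequences that contain the pattern as an ordered subsequence."""
    hits = sum(1 for seq in sequences if contains_in_order(seq, pattern))
    return hits / len(sequences)


sequences = [
    ["A", "B", "C", "D"],
    ["A", "X", "B", "C", "D"],
    ["B", "A", "C", "D"],   # B comes before A, so the pattern is broken
    ["A", "B", "D", "C"],   # D comes before C, so the pattern is broken
]
support = pattern_support(sequences, ["A", "B", "C", "D"])
```

A mining algorithm would search for all patterns whose support exceeds a chosen threshold; this sketch only evaluates one given pattern.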

1999

Visual Data Mining - Guest Editor's Introduction

Wong PC. 1999. Visual Data Mining - Guest Editor's Introduction. IEEE Computer Graphics and Applications, Vol 19, No 5, Sep 1999.

Abstract

Seeing is knowing, though merely seeing is not enough. When you understand what you see, seeing becomes believing. A while ago scientists discovered that seeing and understanding together enable humans to glean knowledge and deeper insight from large amounts of data. The approach integrates the human mind's exploration abilities with the enormous processing power of computers to form a powerful knowledge discovery environment that capitalizes on the best of both worlds. The technology builds on visual and analytical processes developed in various disciplines including scientific visualization, data mining, statistics, and machine learning with custom extensions that handle very large, multidimensional, multivariate data sets. The methodology is based on both functionality that characterizes structures and displays data and human capabilities that perceive patterns, exceptions, trends, and relationships. Here I'll define the vision, present the state of the art, and discuss the future of a young discipline called visual data mining.

Visualizing Association Rules for Text Mining

Wong PC, P Whitney, and J Thomas. 1999. Visualizing Association Rules for Text Mining. Proceedings IEEE Information Visualization 99, San Francisco, CA, Oct 24 - Oct 29, 1999.

Abstract

An association rule in data mining is an implication of the form X → Y where X is a set of antecedent items and Y is the consequent item. For years researchers have developed many tools to visualize association rules. However, few of these tools can handle more than dozens of rules, and none of them can effectively manage rules with multiple antecedents. Thus, it is extremely difficult to visualize and understand the association information of a large data set even when all the rules are available. This paper presents a novel visualization technique to tackle many of these problems. We apply the technology to a text mining study on large corpora. The results indicate that our design can easily handle hundreds of multiple antecedent association rules in a three-dimensional display with minimum human interaction, low occlusion percentage, and no screen swapping.
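
To make the idea of a rule with a set of antecedent items and one consequent item concrete, the Python sketch below evaluates a single rule over a handful of documents treated as sets of terms. Support and confidence are the usual measures attached to association rules, but the abstract does not name them, so their use here is an assumption, as are the function name and the sample data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    transactions: iterable of item collections (e.g. the terms in each document).
    antecedent:   collection of items that must all be present.
    consequent:   a single item.
    """
    antecedent = set(antecedent)
    both = antecedent | {consequent}
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / len(transactions)                        # rule holds in this share
    confidence = n_both / n_antecedent if n_antecedent else 0.0  # share of antecedent matches
    return support, confidence


# Hypothetical documents reduced to sets of terms.
docs = [
    {"wind", "rain", "flood"},
    {"wind", "rain"},
    {"rain", "flood"},
    {"wind", "rain", "flood"},
]
support, confidence = rule_metrics(docs, {"wind", "rain"}, "flood")
```

Here the rule {wind, rain} → flood has two antecedent items, the multiple-antecedent case the abstract singles out as hard to visualize.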

ThemeRiver™: In Search of Trends, Patterns, and Relationships

Havre S, B Hetzler, and L Nowell. 1999. ThemeRiver™: In Search of Trends, Patterns, and Relationships. In Proceedings of IEEE Symposium on Information Visualization, InfoVis '99, October 25-26, San Francisco CA.

Abstract

ThemeRiver™ is a prototype system that visualizes thematic variations over time across a collection of documents. The "river" flows through time, changing width to depict changes in the thematic strength of documents temporally collocated. Themes or topics are represented as colored "currents" flowing within the river that narrow or widen to indicate decreases or increases in the strength of a topic in associated documents at a specific point in time. The river is shown within the context of a timeline and a corresponding textual presentation of external events.

Human Computer Interaction with Global Information Spaces - Beyond Data Mining

Thomas J, K Cook, V Crow, B Hetzler, R May, D McQuerry, R McVeety, N Miller, G Nakamura, L Nowell, P Whitney, and PC Wong. 1999. Human Computer Interaction with Global Information Spaces - Beyond Data Mining. Pacific Northwest National Laboratory, Richland, WA 99352

Abstract

This invited paper describes a vision and progress towards a fundamentally new approach for dealing with the massive information overload situation of the emerging global information age. Today we use techniques such as data mining, through a WIMP interface, for searching or for analysis. Yet, the human mind can deal and interact simultaneously with millions of information items, e.g. documents. The challenge is to find visual paradigms, interaction techniques, and physical devices that encourage a new human information discourse between the human and their massive global and corporate information resources. After describing the vision and the current progress towards some core technology development, we present the grand challenges in bringing this vision to reality.

1998

TOPIC ISLANDS™ - A Wavelet-Based Text Visualization System

Miller NE, PC Wong, M Brewster, and H Foote. 1998. TOPIC ISLANDS™ - A Wavelet-Based Text Visualization System. In Proceedings of the conference on Visualization '98, pp. 189-196.

Abstract

We present a novel approach to visualize and explore unstructured text. The underlying technology, called TOPIC-O-GRAPHY™, applies wavelet transforms to a custom digital signal constructed from words within a document. The resultant multiresolution wavelet energy is used to analyze the characteristics of the narrative flow in the frequency domain, such as theme changes, which is then related to the overall thematic content of the text document using statistical methods. The thematic characteristics of a document can be analyzed at varying degrees of detail, ranging from section-sized text partitions to partitions consisting of a few words. Using this technology, we are developing a visualization system prototype known as TOPIC ISLANDS™ to browse a document, generate fuzzy document outlines, summarize text by levels of detail and according to user interests, define meaningful subdocuments, query text content, and provide summaries of topic evolution.

Four Critical Elements for Designing Information Exploration Systems

Hetzler B and N Miller. 1998. Four Critical Elements for Designing Information Exploration Systems. Presented at Information Exploration workshop for ACM SIGCHI '98. Los Angeles, CA. April 1998. PNNL-SA-29745

Abstract

Designing an information exploration system requires attention to four critical components. Since information exploration is a highly interactive process, the user is a key element. The second and third critical elements are the presentation methods that are used to communicate information and the interaction techniques that enable that user to actively explore that information. Finally, powerful mathematics are needed to identify and manipulate features of the information. This paper describes how these four critical components can work together to flexibly meet varied user goals.

Visualizing the Full Spectrum of Document Relationships

Hetzler B, WM Harris, S Havre, and P Whitney. 1998. Visualizing the Full Spectrum of Document Relationships. In Structures and Relations in Knowledge Organization. Proc. 5th Int. ISKO Conf. Wurzburg: ERGON Verlag, pp. 168-175.

Abstract

Documents embody a rich and potentially very useful set of complex interrelationships, both among the documents themselves and among the terms they contain. However, the very richness of these relationships and the variety of potential applications make it difficult to present them in a usable form. This paper describes an approach that enables the user to visualize a multitude of document or entity relationships. Two visual metaphors are presented that allow the user to gain new insights and understandings by interactively exploring these relationship patterns at multiple levels of detail.

Multi-faceted Insight Through Interoperable Visual Information Analysis Paradigms

Hetzler B, P Whitney, L Martucci, and J Thomas. 1998. Multi-faceted Insight Through Interoperable Visual Information Analysis Paradigms. In Proceedings of IEEE Symposium on Information Visualization, InfoVis '98, October 19-20, 1998, Research Triangle Park, North Carolina. pp. 137-144.

Abstract

To gain insight and understanding of complex information collections, users must be able to visualize and explore many facets of the information. This paper presents several novel visual methods from an information analyst's perspective. We present a sample scenario, using the various methods to gain a variety of insights from a large information collection. We conclude that no single paradigm or visual method is sufficient for many analytical tasks. Often a suite of integrated methods offers a better analytic environment in today's emerging culture of information overload and rapidly changing issues. We also conclude that the interactions among these visual paradigms are equally as important as, if not more important than, the paradigms themselves.

1997

Beyond Word Relations - SIGIR '97

Hetzler, E. 1997. Beyond Word Relations. SIGIR Forum, Fall 1997. Vol 31, No. 2. ACM Press, p. 28-32.

Abstract

Many information retrieval systems identify documents or provide a document visualization based on analysis of a particular relationship among documents — that of similar topical content. But there may be layers of other less apparent and less traditional relationships that are useful to the user. Exploring this other information was the subject of this workshop, with a focus on identifying new non-traditional relationships. An initial taxonomy was introduced and fleshed out during the workshop.

The Need For Metrics In Visual Information Analysis

Miller NE, G Nakamoto, B Hetzler, and P Whitney. 1997. The Need For Metrics In Visual Information Analysis. Workshop on New Paradigms in Information Visualization and Manipulation in conjunction with the Sixth ACM International Conference on Information and Knowledge Management (CIKM '97), November 13-14, 1997, Las Vegas, Nevada. ACM Press.

Abstract

This paper explores several methods for visualizing the thematic content of large document collections. As opposed to traditional query-driven document retrieval, these methods are used for exploring and gaining insight into document collections. For our experiments, we used 12,000 medical abstracts. The SPIRE [now IN-SPIRE] system was used to create the mathematical signal from text and to project the documents into a universe of "docustars" and as a thematic contour map based on thematic proximity. A self-organizing map is used to project the documents onto a "Tree" fractal. A topic-based approach is used to align documents between concepts in the "Cosmic Tumbleweed" projection. In the 32-D Hypercube, documents are organized by cascading theme strengths. An argument is made for a new type of metric that would facilitate comparisons among the many methods for visualizing or browsing document collections. An initial organization is proposed for some of the relevant research that metrics for information visualization can draw upon.

The STARLIGHT Information Visualization System

Risch, J.S., Rex, D.B., Dowson, S.T., Walters, T.B., May, R.A., and Moon, B.D., 1997, The STARLIGHT Information Visualization System, In: Proceedings of the 1997 IEEE International Information Visualization Conference (IV '97), August 27-29, 1997, London, England.
