Working with Multiple Views

Step 1: Introduction

This tutorial demonstrates how to manipulate views in Genome Workbench. After completing it, you will know how to create new views, create workspace splitters and tabs, and move views between different locations. You will also learn how views communicate with each other, and how Genome Workbench lets you see selections in one view reflected in other views.

In order to get the full benefit of this tutorial, you will need to download the sample data set for the Barcode project.

You should complete Tutorial 1: Basic Operation first.

Step 2: Get the tools and sample data

The sample data set contains a project that references a set of sequences for the Collembola species. These sequences are all examples of cytochrome oxidase from several closely related organisms. In addition, the project contains a protein multiple alignment generated using MUSCLE, and a phylogenetic tree reconstructed from this alignment.

MUSCLE is not distributed with Genome Workbench but it is found here:

The sample data can be obtained directly from:

Step 3: Opening a Multiple Alignment View

Let's start by opening the example workspace you downloaded. Choose the open folder icon from the main toolbar, or choose File->Open from the menu bar. Choose Projects or Workspace from the left side of the dialog and click the button with the ... on the right side. Then navigate to the barcode.gbw file you downloaded and click Open. Then click Finish.

Next, open a multiple alignment view on the protein alignment. You can do this by selecting the multiple alignment in the project tree (the MUSCLE alignment), right-clicking and choosing Open View. An easier way to get there is to double-click the item; this will bring up the Open View dialog, as shown at the right. Or you can choose View->Open View from the main menu.

Select Multiple Alignment View and click Next.

Step 4: Multiple Alignment View Features

The default view for the multiple alignment will appear. You will see an image like the one below.

Column Settings

There are several features to note:

  • The header row contains a set of column headings. The set of visible columns is up to you; you can rearrange the columns using drag-and-drop, and you can right-click on the header and choose Settings to bring up a menu for selecting additional columns.
  • As with the graphical view, you can zoom interactively on this view by using Z + Left Click/Drag for the interactive zoom slider, or R + Left Click/Drag for a range zoom.
  • A tooltip will appear if you hover over a location. This tooltip tells you which sequence you are hovering on, as well as the position on that sequence (in both sequence coordinate space and alignment coordinate space).

Multiple Alignment Default

Step 5: Coloration Schemes

Now, we'll look at alternate ways to score and color an alignment. Genome Workbench provides a variety of means for scoring alignments. If you right-click in the alignment view, you will see a context menu like the one at the right. Choose Coloration -> Select Method... to see the list of available schemes for coloring proteins.

The default coloration scheme is called Frequency-Based Differences. This method assigns a score to each residue in a column in an alignment; the color chosen is based on the inverse frequency in a column. Thus, rare changes in a column receive dark coloration, while common variations appear light. In addition, since this is based on frequency, if there is substantial difference somewhere in the column, the entire column will be shaded. This extra shading allows you to see locations of mismatches in alignments without needing to see the actual mismatches.
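The idea of frequency-based scoring can be sketched in a few lines of Python. This is only an illustration of the inverse-frequency principle described above, not the calculation Genome Workbench actually performs; the function name and the sample alignment are hypothetical.

```python
from collections import Counter


def column_scores(column):
    """Score each residue in one alignment column by inverse frequency.

    A score of 0 means every row has the same residue. Scores closer
    to 1 mean the residue is rare in this column.
    """
    counts = Counter(column)
    total = len(column)
    return [1.0 - counts[residue] / total for residue in column]


# Hypothetical three-row alignment, one string per row.
alignment = ["MKVL", "MKIL", "MRVL"]

# zip(*alignment) transposes the rows into columns.
for index, column in enumerate(zip(*alignment)):
    scores = [round(score, 2) for score in column_scores(column)]
    print(index, "".join(column), scores)
```

In this sketch a fully conserved column scores 0 everywhere, while in a mixed column even the common residue receives a small non-zero score, which mirrors the whole-column shading described above.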

Coloration Methods Protein

Let's change the method to one that scores based on a hydropathy scale. This method scores each amino acid residue independently and provides a colorimetric scale between hydrophobic (red) and hydrophilic (blue). Click on Hydropathy Scale and click Select. Once this is set, you should see the multiple alignment view change to look like the image below.

Coloration applied

Step 6: More About Coloration Schemes

Each coloration method offers its own configuration settings. While many of these settings are not ones that most people would want to change, some of these are notable, so let's look at how to change these. Right-click in the multiple alignment view and choose Coloration -> Method Properties.... You should see the menu as in the image on the right.

Choosing this brings up a properties dialog for the coloration scheme. There are several things to note here:

  • The colors used are configurable. The default for the hydropathy scale is red for hydrophobic and blue for hydrophilic. In addition, the color used for neutral is provided. You may change each of these colors. In addition, there is a slider above the color scale so you can select the degree of gradation between colors.
  • There is a check box to toggle consensus scoring. Toggling this changes the calculation of score so that coloration is based on the difference from the average score in a column, rather than on the single score provided for the amino acid. Choosing this allows you to investigate variance within a column. Please check Use consensus now and click OK. You should see the screen change to match the image below.
  • When using consensus scoring, you can choose to provide a window for averaging across an alignment. The default is 1 - that is, consider the average score for a column to consist of the averages of all residues in that column. You can change this to include adjacent columns as well. For hydropathy, this provides a means for identifying regions of the alignment that are more or less hydrophobic or hydrophilic than expected.
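To make the difference between plain and consensus scoring concrete, here is a rough Python sketch. It assumes the Kyte-Doolittle hydropathy values (the tutorial does not say which scale Genome Workbench uses) and a window measured in columns centered on the current column; both are assumptions made for illustration, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values. This scale is an assumption; the
# tutorial does not name the scale used by Genome Workbench.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def consensus_scores(alignment, window=1):
    """Return, for each cell, its hydropathy minus the windowed average.

    With window=1 the average covers only the cell's own column.
    Larger odd windows also include the neighboring columns.
    Gaps and unknown characters are skipped and score 0.
    """
    width = len(alignment[0])
    half = window // 2
    result = []
    for row in alignment:
        row_scores = []
        for col in range(width):
            first = max(0, col - half)
            last = min(width, col + half + 1)
            values = [
                KD[other[c]]
                for other in alignment
                for c in range(first, last)
                if other[c] in KD
            ]
            mean = sum(values) / len(values) if values else 0.0
            row_scores.append(KD.get(row[col], mean) - mean)
        result.append(row_scores)
    return result


# Hypothetical two-row alignment: the first column differs strongly.
for row in consensus_scores(["IL", "KL"]):
    print([round(score, 2) for score in row])
```

A conserved column scores 0 under consensus scoring regardless of how hydrophobic it is, so only columns with variation stand out.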

Step 7: Adding a Phylogenetic Tree View

Next, let's add a phylogenetic tree view. Select the Phylogenetic tree item in the Project Tree, open the Open View dialog, as shown on the right, and choose Tree View. You should see a view like the one in the image below appear.

Phy Tree Build

This tree was constructed from the alignment in this project. It was obtained by choosing Tools -> Run Tool, selecting the Phylogenetic Tree Builder Tool, and using the Neighbor Joining algorithm.

Step 8: Phylogenetic Tree View Features

The phylogenetic tree view offers a variety of ways to manipulate and edit trees. We'll discuss a few of these below.

Layout Options

Phylogenetic Tree Radial Phylogenetic Tree Radial Menu

The phylogenetic tree offers several different methods to lay out the nodes in the tree. One option is to use a Force Layout. Such a tree more adequately reflects the fact that the tree is itself unrooted, since this is a reconstruction. This mode is available in the context menu under the Layout option as shown on the right. There are other layout options available, including:

  • Rectangular Cladogram - the default view
  • Slanted Cladogram - provides a triangular view of the tree
  • Radial Tree - shows the tree in radial format
  • Force Layout - shows the tree in radial format, using a physics-based approximation to adjust the layout

Searching

Phylogenetic Tree query

The phylogenetic tree offers a powerful search implemented via the search bar at the top of the window. Two search methods are implemented within the single interface:

  • Simple string matching
  • Full query search

Simple string matching allows you to type in some text and then press enter or Start to search for that text within all node properties in the current tree. If your text includes blanks, enclose it in quotes to force the search tool to use simple string matching. If the text in the search box has blank spaces and is not enclosed in quotes, the search engine will attempt to parse it according to the query language syntax.

Phylogenetic Tree Query Phylogenetic Tree 2

After a query is executed, the matching nodes replace any currently selected nodes. To enhance visualization of the results, click Hide Unselected on the toolbar, which draws the nodes not selected by the query semi-transparently. To view the nodes one at a time, click Select All and then use Prev and Next to go through the selected nodes individually.

Full query search allows you to create logical queries similar to how you select records in an SQL database. In this format, use the node properties and compare them with other properties or values of your choice. Queries can be built from a combination of comparisons, such as equal and greater-than, combined with logical operators such as AND and OR. Logical operators may be given in upper or lower case. While typing in a query, node property names will be highlighted in blue. To execute the query, press 'Enter' in the search box or click the Start button. While a query is running you will not be able to manipulate the tree. In the event that a query takes too long, click the Stop button to stop the query.

The full query search syntax allows for a number of comparison and logical operators that you can combine with values you enter in the search string and properties from the current tree. The valid query elements include:

  • String, numeric and boolean values (such as 5, 0.2, true, "mitochondrial")
  • Node properties (such as seq-id, dist, organism, cluster-id)
  • Simple comparisons: <, <=, >, >=, =, !=
  • The 'Like' Comparison which allows wildcards: organism like Desoria*
  • 'Between' comparison: dist between 0.02 and 0.05
  • 'In' comparison: seq-id in (AAT66216, AAT66240.1)
  • Logical Operators: AND, OR, XOR, NOT

Some valid queries for the sample project are:

organism = "Archisotoma polaris" and seq-id = "AAT66228"
dist between 0.002 and 0.003 or seq-id==AAT66206
label like "AAT6619*" xor dist > 0.002
seq-id in (AAT66197, AAT66220, AAT66229)
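If it helps to think about what these queries select, the following Python sketch expresses the same four conditions as ordinary predicates over a list of nodes. It does not parse the query language and says nothing about how the search engine is implemented. The node values are invented for illustration, and treating 'between' as inclusive is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical node properties; the values are made up for this example.
nodes = [
    {"seq-id": "AAT66228", "label": "AAT66228",
     "organism": "Archisotoma polaris", "dist": 0.0025},
    {"seq-id": "AAT66197", "label": "AAT66197",
     "organism": "Hypogastrura concolor", "dist": 0.02},
    {"seq-id": "AAT66206", "label": "AAT66206",
     "organism": "Desoria sp.", "dist": 0.001},
]


def like(value, pattern):
    """Wildcard match, where * stands for any run of characters."""
    return fnmatchcase(str(value), pattern)


def between(value, low, high):
    """Range test; inclusive bounds are an assumption."""
    return low <= value <= high


def select(predicate):
    """Return the seq-ids of all nodes for which predicate is true."""
    return [node["seq-id"] for node in nodes if predicate(node)]


# organism = "Archisotoma polaris" and seq-id = "AAT66228"
q1 = select(lambda n: n["organism"] == "Archisotoma polaris"
            and n["seq-id"] == "AAT66228")

# dist between 0.002 and 0.003 or seq-id==AAT66206
q2 = select(lambda n: between(n["dist"], 0.002, 0.003)
            or n["seq-id"] == "AAT66206")

# label like "AAT6619*" xor dist > 0.002
# (for two booleans, != behaves as exclusive or)
q3 = select(lambda n: like(n["label"], "AAT6619*") != (n["dist"] > 0.002))

# seq-id in (AAT66197, AAT66220, AAT66229)
q4 = select(lambda n: n["seq-id"] in ("AAT66197", "AAT66220", "AAT66229"))

print(q1, q2, q3, q4)
```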

Distances

Phylogenetic Tree use distance Phylogenetic Tree Distances

As generated in Genome Workbench, the phylogenetic tree marks each node with a computed distance from the presumed root. This distance can be used to alter the rendering to show graphically how each node is related based on its distance from the root. This option is available in the context menu by right-clicking and selecting Layout -> Use Distances, as shown on the right.

Distances can be used in the Rectangular Cladogram as well as in the Radial Tree.

Labeling

Phylogenetic tree setting new label Phylogenetic Tree Organisms

The phylogenetic tree, by default, displays the sequence identifier at each node. This can be changed by using the Settings option in the right-click context menu. When you select this, you see a tabbed dialog like the one on the right. Select the Labels heading to change the labels. This dialog contains some simple labeling options to select the accession or organism name. In addition, you can select Custom Labels to construct a label from the available properties in the tree. The example here uses the label

$(label) - $(organism)

to construct a label containing the organism name. The drop-down and Insert button on this page may be used to insert the properties without needing to know the syntax.

Once the labels are set, you will see the phylogenetic tree view change to match the image on the right. Each node is now marked with the sequence accession as well as the species name.

Node Markers

Phylogenetic Tree Markers

Sometimes it may be desirable to highlight individual nodes in the tree with a marker attribute that will change the color and size of the displayed node. Markers are added as a property in the "Node Properties" dialog. You can display this dialog by using the Properties option in the right-click context menu for the node.

Markers are created by adding a property with name "marker" to the properties list for a node. The marker value may include one or more colors and, optionally, a size parameter. Each color is specified as three RGB values between 0 and 255 enclosed in square brackets, e.g. [64 0 128]. The numbers may be separated by commas and/or spaces. If a fourth value, commonly called the alpha channel, is given between the brackets, it is ignored. When multiple colors are given, the marker is divided evenly between the given colors, and looks much like a pie chart.

Examples

Red marker, default size

[255 0 0]

Marker that is 50% red and 50% green with large size

[0 255 0] [255 0 0] size=4
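If you generate marker values with a script, it can be useful to check them before pasting them into the Node Properties dialog. The sketch below is a hypothetical validator written from the rules described above; it is not how Genome Workbench itself reads the property, and accepting "size" in any letter case is an assumption.

```python
import re


def parse_marker(value):
    """Split a marker value into a list of RGB tuples and an optional size.

    Numbers inside the brackets may be separated by commas and/or spaces.
    A fourth (alpha) number is dropped, as the tutorial says it is ignored.
    """
    colors = []
    for group in re.findall(r"\[([^\]]*)\]", value):
        parts = [int(p) for p in re.split(r"[,\s]+", group.strip()) if p]
        if len(parts) < 3 or any(not 0 <= p <= 255 for p in parts):
            raise ValueError("bad color: [" + group + "]")
        colors.append(tuple(parts[:3]))
    match = re.search(r"size\s*=\s*(\d+)", value, re.IGNORECASE)
    size = int(match.group(1)) if match else None
    return colors, size


print(parse_marker("[255 0 0]"))
print(parse_marker("[0 255 0] [255 0 0] size=4"))
```

A return value with two colors corresponds to the pie-chart style marker, with each color taking half of the marker.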

Subtree Boundaries

Phylogenetic Tree Bound force Phylogenetic Tree slanted cladogram Phylogenetic Tree cladogram

The phylogenetic tree supports adding a colored boundary to one or more subtrees. Boundaries are added as a property to the parent node of the subtree using the "Node Properties" dialog. You can display this dialog by using the Properties option in the right-click context menu for the node.

Boundaries are created by adding a property with name "$NODE_BOUNDED" to the properties list for a node. There are several parameters for a boundary including its shape, color, border width and whether or not the boundary should include text. It is also possible to define different boundary shapes for each of the different layout methods. Parameters other than the shape will remain the same for each layout method.

Parameters for the boundary regions are not case-sensitive. Colors are specified in the format [0..255, 0..255, 0..255, 0..255] for red, green, blue and, optionally, alpha. The numbers may be separated by spaces and/or commas. Parameters that require a value are specified in the form "parameter=x", and the possible values for 'x' are shown below. Boolean parameters can be 'true', 'yes', 'y', 'false', 'no', or 'n'. Parameters such as color and border that apply to more than one boundary shape will be applied to all applicable shapes.

Shape Parameters

The following parameters specify the shapes to be used for different layouts. If the same boundary shape is to be used for all layouts, specify only the 'Shape' parameter. To override the 'Shape' parameter for other layouts, specify the shape for that layout.

Shape={Rectangle, RoundedRectangle, Triangle}
RectCladogram={Rectangle, RoundedRectangle, Triangle}
SlantedCladogram={Rectangle, RoundedRectangle, Triangle}
Radial={Rectangle, RoundedRectangle, Triangle}
ForceLayout={Rectangle, RoundedRectangle, Triangle}

Appearance Parameters

These parameters apply to all the different shapes. The boundary color is specified as [r, g, b, a] without the 'keyword=' syntax and it can include an optional transparency, or alpha, value where 0 is fully transparent and 255 is fully opaque. The 'DrawEdge' parameter adds a 1-pixel border to the boundary. The edge color defaults to black but can be changed with the 'EdgeColor' parameter. 'Border' expands the overall shape by a specified number of pixels and 'Corner' rounds off the corners in RoundedRectangles and Triangles. Since rounding corners brings corners inward, it may be helpful to increase 'Border' to compensate. If 'IncludeText' is true, the boundary shape will be expanded to include node labels.

[0..255, 0..255, 0..255, 0..255]
Border=n
Corner=n
DrawEdge={true, false}
EdgeColor=[0..255, 0..255, 0..255, 0..255]
IncludeText={true, false}

Triangle Parameters

This last set of parameters applies only to triangles. If 'AxisAligned' is true then the shape is aligned with the nearest x or y axis. This defaults to 'true'. The 'TextBox' parameter forces the text of the bounded nodes to be placed in a square box rather than expanding the triangle to include the text. Lastly, 'TriOffset' is the distance behind the root node at which the triangle apex should be placed. It defaults to '40' units.

AxisAligned={true, false}
TextBox={true, false}
TriOffset=n

Examples

Green rectangle boundary for all layouts with text included but no border or edge.

[0 255 0 255] shape=Rectangle IncludeText=true

Red triangle that does not include a text box and has rounded corners and a black edge.

[255 0 0 128] Shape=Triangle corner=10 border=10 textbox=false 
drawedge=y AxisAligned=false
          

Blue rectangle with rounded corners for the rectangular cladogram layout, triangle for slanted cladogram and force layouts and rounded rectangle for radial layouts. Boundaries will not be expanded to include text. Corners will be rounded and a 10-pixel border will expand the boundary size.

[0 0 255 128] shape=RoundedRectangle SlantedCladogram=Triangle 
Radial=RoundedRectangle ForceLayout=Triangle drawedge=n corner=10 
border=10  textbox=false IncludeText=false  AxisAligned=false
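Because a boundary value mixes a bare color with several keyword parameters, it is easy to mistype. The following Python sketch reads a value into a dictionary using only the rules described in this section (case-insensitive parameter names, the listed boolean spellings, and bracketed colors). It is a hypothetical helper for checking your own strings, not the parser Genome Workbench uses, and storing the bare color under the key "color" is a choice made for this example.

```python
import re

TRUE_WORDS = {"true", "yes", "y"}
FALSE_WORDS = {"false", "no", "n"}

# Matches name=[...] or name=word.
PAIR = re.compile(r"(\w+)\s*=\s*(?:\[([^\]]*)\]|([^\s\[]+))")


def parse_color(text):
    """Turn '0, 255 0 128' into (0, 255, 0, 128)."""
    return tuple(int(p) for p in re.split(r"[,\s]+", text.strip()) if p)


def parse_boundary(value):
    """Read a boundary value into a dict keyed by lowercase parameter name."""
    params = {}
    for key, bracket, word in PAIR.findall(value):
        key = key.lower()
        lowered = word.lower()
        if not word:
            params[key] = parse_color(bracket)
        elif lowered in TRUE_WORDS:
            params[key] = True
        elif lowered in FALSE_WORDS:
            params[key] = False
        elif word.isdigit():
            params[key] = int(word)
        else:
            params[key] = word
    # Whatever bracket group is left after removing name=value pairs
    # is the bare boundary color.
    bare = re.search(r"\[([^\]]*)\]", PAIR.sub("", value))
    if bare:
        params["color"] = parse_color(bare.group(1))
    return params


example = (
    "[0 0 255 128] shape=RoundedRectangle SlantedCladogram=Triangle "
    "Radial=RoundedRectangle ForceLayout=Triangle drawedge=n corner=10 "
    "border=10  textbox=false IncludeText=false  AxisAligned=false"
)
for name, setting in parse_boundary(example).items():
    print(name, setting)
```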
          

Saving Images

If you need to save a screen capture of the current tree, select Save Images... from the File menu to bring up the Save Images dialog. The dialog allows you to save the tree as a single image, or to divide the image into equal-sized tiles (sub-images) and save those to a directory. When saving the images, you can, via Printing Guides, display cutting markers and names of adjacent image tiles in the image margins. This is useful for saving images that will be printed and then reassembled into a poster presentation.

Save Images 1

Save Images 3 Save Images 2

In the Save Images dialog, use the Partitions slider to subdivide the image into multiple sub-images, each of which will be saved to a separate file in the directory name given by Directory. The names of the image files are displayed on each tile and are a combination of File Name and the image's index given according to the numbering scheme in Numbering. Use the Image Size to specify the size of each individual image saved, and use proportions to set the width-to-height ratio to make images as small as possible or to force them to a standard (paper) size.

Loading Attributes

Load Attributes

To update the properties of nodes in a tree from a file, right-click on the background to bring up the context menu, and then select Load Attributes. This feature loads the properties from a flat file. The attributes in the file can include both updates to existing node attributes and new attributes. The sequence identifier property, seq-id, is used as the key to match nodes in the file to nodes in the tree. This implies that the feature can't be used to directly update nodes that do not have a seq-id.

The file that provides the updates to the node properties has a well-defined format, an example of which is shown below. The first line of the file must contain the file-identifier:

#BKBTA-1
          

The next line must specify the names of all the node properties that are given in the file. The list of property names should start with # and the individual properties should be separated by spaces. The first property has to be seq-id since this is used to match entries in the file with nodes in the tree.

#seq-id cluster-id label dist
          

After these two lines, the following lines contain the actual node identifiers and properties to update. Additionally, any lines after the first two lines that start with # are read as comments and will be ignored. The list of properties for each node must be separated by tabs, not spaces.

AAT66197	2	Hypogastrura concolor	0.02
          

This is an example that provides a cluster-id for a set of nodes in the sample barcode project:

#BKBTA-1
#seq-id cluster-id
#Add a cluster id to the tree
AAT66197	2
AAT66196	2
AAT66189	2
AAT66223	2
AAT66236	2
AAT66216	2
AAT66230	2
AAT66203	2
AAT66195	2
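Since the data lines must use tabs, which are easy to lose in a text editor, you may prefer to generate the file with a short script. The following Python sketch builds the text from the format rules given above; the function name is hypothetical, and the output has not been checked against every case Genome Workbench may accept.

```python
def format_attributes(property_names, rows, comment=None):
    """Build the text of an attribute file.

    property_names must start with 'seq-id'. Each row is a sequence of
    values in the same order as property_names.
    """
    if not property_names or property_names[0] != "seq-id":
        raise ValueError("the first property must be seq-id")
    # Line 1 is the file identifier. Line 2 lists the property names,
    # separated by spaces.
    lines = ["#BKBTA-1", "#" + " ".join(property_names)]
    if comment:
        lines.append("#" + comment)
    for row in rows:
        if len(row) != len(property_names):
            raise ValueError("wrong number of values: " + repr(row))
        # Data values are separated by tabs, not spaces.
        lines.append("\t".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


text = format_attributes(
    ["seq-id", "cluster-id"],
    [("AAT66197", 2), ("AAT66196", 2), ("AAT66189", 2)],
    comment="Add a cluster id to the tree",
)
print(text, end="")
```

Write the returned text to a file of your choice and pick that file when you select Load Attributes.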
          

Step 9: Arranging Windows

One of the powerful features in Genome Workbench is the ability to move views where you'd like them, create tabbed stacks, and resize any view. Our goal here is to take the search view and the selection inspector and dock them with the tab group on the bottom left. We will then resize the selection inspector and use it to inspect the nodes in our phylogenetic tree.

Click on the Search view tab and drag it over the title bar in the bottom left view. As you drag, you'll see the dock icons appear giving you choices where to put the view. Choose the center icon when over the bottom left view. See right.

Do the same thing with the Selection Inspector. Then resize the bottom panel by moving the mouse over the divider and when it changes to a double arrow, click and drag. All the frames are resizable using the same technique.

Feel free to experiment with moving, docking and undocking, and resizing windows to find the set up that works for you.

Then go ahead and click on a node in the phylogenetic tree; the selection inspector will show the item dynamically. The item displayed changes based on where you click in the tree.

Once this is completed, you should have a view that looks like the image below.

Step 10: Interactions Between Views

So why would you want to go to the trouble of arranging views like this? The primary reason is to see several aspects of the same data simultaneously. Genome Workbench provides this ability. To see it in action, open a Multiple Alignment View on the MUSCLE alignment in the Project Tree View.

Multiple Alignment View

Dock the Multiple Alignment View on the bottom of the gBench window as we did in the previous step. Your view should look like the view on the right.

Show Only Selection Chosen Show Only Selected

If you click on a node in the Phylogenetic Tree view, the corresponding rows will be highlighted in the Multiple Alignment View. There can be many rows in the Multiple Alignment View, so there are two ways to bring the relevant rows into view.

The first way is to right-click (or control-click) on a description in the Multiple Alignment View and select Hide/Show -> Show Only Selected from the contextual menu.

Move Selected Items Move Selected Items Chosen

The second way is to right-click (or control-click) on a description in the Multiple Alignment View and select Move Selected Items Up from the contextual menu.

You can reverse this operation at any time by right-clicking and selecting Hide/Show -> Show All.

Step 11: Finished

This completes this tutorial. In this tutorial, we covered:

  • How to create different kinds of views on your data (Multiple Alignment View, Phylogenetic Tree View, Selection Inspector)
  • How to use scoring and coloration schemes in the multiple alignment view to see differences in your data.
  • How to manipulate the phylogenetic tree view to provide more informative displays.
  • How to arrange views to provide several different views of the same data on the screen at once.
  • How to see selections shown between different views.

Last updated: 2012-08-27T13:09:11-04:00