Create a viz story in Tableau – example
Data courtesy of
Rebecca Rose and Susanna Lamers @ bioinfox.com
Original data source:
|All sample data:|
|The SequenceID column is “Sequence Name”.|
|All couples in R13:|
|Each SequenceID is listed only once, so 91 couples means a total of 182 SequenceIDs belong to a couple.|
|clusters:||Five cluster files, one per cluster.|
Processed data source
- Read the couples data; for each couple (two IDs), add columns for their ClusterIDs.
- Calculate the Category. The “Category” column has three values: IN, OUT, INOUT.
- IN – If both members of the couple are in the same cluster
- OUT – If the couple is not in a cluster
- INOUT – If only one member of the couple is in a cluster
- Same as the couples_checkcls.csv file, but with the ClusterID columns dropped.
- Create a new dataframe with the c1 and c2 columns swapped, leaving the Category column unchanged, then concatenate the two dataframes. This ensures that column “c1id” covers every ID that has a couple.
- Read all sample data and keep only the R13 subset
- Read the cluster datasets and add clusterID columns for each SequenceID
- original data columns
- plus the new clusterID columns: ClusterID_01,ClusterID_02,ClusterID_03,ClusterID_04,ClusterID_05,ClusterID_053
Here the clusterID columns replace the ClusterID columns that were dropped from the couples_category.csv file.
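The category and mirroring steps above can be sketched in pandas. This is only an illustration under stated assumptions, not the script that produced the files: the couples table is assumed to have columns `c1id` and `c2id`, `cluster_of` is a hypothetical dictionary from SequenceID to cluster ID, and the function names are made up for this example. The source does not say how to label a couple whose members sit in two different clusters; the sketch treats that case as OUT.

```python
import pandas as pd


def categorize(c1_cluster, c2_cluster):
    """Return IN, OUT, or INOUT from each member's cluster ID (NaN/None = no cluster)."""
    in1 = pd.notna(c1_cluster)
    in2 = pd.notna(c2_cluster)
    if in1 and in2:
        # Assumption: members in two different clusters are labeled OUT.
        return "IN" if c1_cluster == c2_cluster else "OUT"
    if in1 or in2:
        return "INOUT"
    return "OUT"


def build_couples_category(couples, cluster_of):
    """Add a Category per couple, then mirror the rows so c1id covers every coupled ID."""
    checked = couples.copy()
    # IDs missing from cluster_of become NaN, meaning "not in any cluster".
    checked["c1_cluster"] = checked["c1id"].map(cluster_of)
    checked["c2_cluster"] = checked["c2id"].map(cluster_of)
    checked["Category"] = [
        categorize(a, b)
        for a, b in zip(checked["c1_cluster"], checked["c2_cluster"])
    ]
    # Drop the cluster columns, as described for couples_category.csv.
    category = checked[["c1id", "c2id", "Category"]]
    # Swap c1id and c2id, keep Category unchanged, and stack both frames.
    mirrored = category.rename(columns={"c1id": "c2id", "c2id": "c1id"})
    mirrored = mirrored[["c1id", "c2id", "Category"]]
    return pd.concat([category, mirrored], ignore_index=True)
```

With 91 couples as input, the result would have 182 rows, one per SequenceID that has a couple.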
Created data source for mapping
- Open the Regions image in Illustrator and use the pen tool to trace a path around each region area.
- Select each path and, from the menu Window->Attributes, set “image map” to “polygon” and “URL” to an identifier of your choice, e.g. “#R7”.
- From the menu File->Save for Web and Devices, save the image map as a web HTML file.
- Open the HTML file in a text editor; it contains a list of control points for each path identifier (URL).
- Add the point list for each Region to your CSV file, with an incrementing PointOrder (starting from 1).
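Copying the control points out of the HTML file by hand can also be scripted. The sketch below uses only the Python standard library and assumes the exported file contains standard HTML `<area>` elements whose `href` holds the identifier (such as `#R7`) and whose `coords` attribute holds `x1,y1,x2,y2,...`. The output column names X and Y, the class `AreaParser`, and the helper functions are assumptions for illustration.

```python
import csv
from html.parser import HTMLParser


class AreaParser(HTMLParser):
    """Collect (Region, PointOrder, X, Y) rows from <area> tags of an image map."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag != "area":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        coords = attr.get("coords") or ""
        # Assumption: identifiers look like "#R7"; anything else is skipped.
        if not (href.startswith("#R") and href[2:].isdigit() and coords):
            return
        region = int(href[2:])
        nums = [int(float(v)) for v in coords.split(",") if v.strip()]
        # coords alternate x, y; PointOrder starts from 1.
        for order, (x, y) in enumerate(zip(nums[0::2], nums[1::2]), start=1):
            self.rows.append((region, order, x, y))


def polygon_rows(html_text):
    """Parse the image-map HTML and return the polygon rows."""
    parser = AreaParser()
    parser.feed(html_text)
    return parser.rows


def write_polygon_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Region", "PointOrder", "X", "Y"])
        writer.writerows(rows)
```

Image-map coordinates are measured from the top-left corner of the image, so the Y axis may need to be reversed when the points are plotted over a background image.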
|Left join:||GP41_R13_with_cluster.csv (Sequence Name) with GP41_R13_couples_category.csv (c1id), data type: String|
|Inner join:||GP41_R13_with_cluster.csv (Region) with region_polygon.csv (Region), data type: Number (Whole)|
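To sanity-check the two joins before building them in Tableau, they can be reproduced in pandas. This is a minimal sketch that assumes the three CSV files have already been loaded into dataframes with the column names given above; the function name `join_sources` is hypothetical.

```python
import pandas as pd


def join_sources(samples, couples, polygons):
    """Left-join samples to couples, then inner-join the result to the region polygons."""
    # Left join: every sample row is kept; samples without a couple get NaN.
    merged = samples.merge(
        couples, how="left", left_on="Sequence Name", right_on="c1id"
    )
    # Inner join: only samples whose Region has polygon points survive.
    return merged.merge(polygons, how="inner", on="Region")
```

The inner join repeats each sample row once per polygon point of its Region, which is worth keeping in mind when reading record counts from the joined data.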
Calculated fields in Tableau
|coupleStatus:||IF hasCouple THEN "Couple" ELSE "NoCouple" END|
|IF coupleInCluster THEN "IN" ELSE "OUT" END|
Worksheets and Dashboards:
|Cluster tree map:|
color by “number of records”
|Regions filled map:|
plot AVG(X) and AVG(Y), convert “Region” to a dimension and place it on Detail, set the path to “PointOrder”, and color by “number of records”. Also set the map image as the background image.
|Subtype Pie chart:|
color by “Subtype”, angle by “number of records”
|Subtype bar chart:|
color by “Subtype”, size by “number of records”
|hasCouple pie chart:|
color by “hasCouple”, angle by “number of records”
|coupleInCluster pie chart:|
color by “coupleInCluster”, angle by “number of records”
|dual axis charts:|
Answer the questions:
- Which Region is the most affected, and which is the least?
- Explore the Subtype distribution among regions
- Does the Subtype distribution change from cluster to cluster?
- For each cluster, how many couples fall into the same cluster?
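The first two questions can be cross-checked outside Tableau with a few lines of pandas. This is a rough sketch that assumes the un-joined sample data (one row per sequence) is loaded in a dataframe with “Region” and “Subtype” columns; both function names are hypothetical.

```python
import pandas as pd


def region_counts(samples):
    """Number of records per Region, most affected first."""
    return samples.groupby("Region").size().sort_values(ascending=False)


def subtype_by_region(samples):
    """Table of record counts with one row per Region and one column per Subtype."""
    return pd.crosstab(samples["Region"], samples["Subtype"])
```

The first and last entries of `region_counts(samples)` correspond to the most and least affected Region among those present in the data.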