Geographic bubble charts are a way to visualize data overlaid on a map. For data with geographic characteristics, these charts can provide much-needed context. In this example, you import a file into MATLAB® as a table and create a geographic bubble chart from the table variables (columns). Then you work with the data in the table to visualize aspects of the data, such as population size.
Load the sample file counties.xlsx, which contains records of population and Lyme disease occurrences by county in New England. Read the data into a table using readtable.
counties = readtable('counties.xlsx');
Create a geographic bubble chart that shows the locations of counties in New England. Specify the table, counties, as the first argument. The geographic bubble chart stores the table in its SourceTable property. The example displays the first five rows of the table. Use the 'Latitude' and 'Longitude' columns of the table to specify locations. The chart automatically sets the latitude and longitude limits of the underlying map, called the basemap, to include only those areas represented by the data. Assign the GeographicBubbleChart object to the variable gb. Use gb to modify the chart after it is created.
figure
gb = geobubble(counties,'Latitude','Longitude');
head(gb.SourceTable, 5)
ans=5×19 table
FIPS    ANSICODE      Latitude    Longitude    CountyName               State     StateName          Population2010    HousingUnits2010    LandArea      WaterArea     Cases2010    Cases2011    Cases2012    Cases2013    Cases2014    Cases2015    Cases2014_1    Cases2015_1
____    __________    ________    _________    _____________________    ______    _______________    ______________    ________________    __________    __________    _________    _________    _________    _________    _________    _________    ___________    ___________
9001    2.1279e+05    41.228      -73.367      {'Fairfield County' }    {'CT'}    {'Connecticut'}    9.1683e+05        3.6122e+05          1.6185e+09    5.4916e+08    331          305          225          443          437          427          437            427
9003    2.1234e+05    41.806      -72.733      {'Hartford County' }     {'CT'}    {'Connecticut'}    8.9401e+05        3.7425e+05          1.9039e+09    4.0213e+07    187          167          143          288          291          335          291            335
9005    2.128e+05     41.792      -73.235      {'Litchfield County'}    {'CT'}    {'Connecticut'}    1.8993e+05        87550               2.3842e+09    6.2166e+07    88           118          67           187          168          202          168            202
9007    2.128e+05     41.435      -72.524      {'Middlesex County' }    {'CT'}    {'Connecticut'}    1.6568e+05        74837               9.5649e+08    1.8068e+08    125          109          93           181          155          241          155            241
9009    2.128e+05     41.35       -72.9        {'New Haven County' }    {'CT'}    {'Connecticut'}    8.6248e+05        3.62e+05            1.5657e+09    6.6705e+08    240          249          213          388          459          474          459            474
You can pan and zoom in and out on the basemap displayed by the geobubble function.
Use bubble size (diameter) to indicate the relative populations of the different counties. Specify the Population2010 variable in the table as the value of the SizeVariable parameter. In the resultant geographic bubble chart, the bubbles have different sizes to indicate population. The chart includes a legend that describes how diameter expresses size. Adjust the limits of the chart using geolimits.
gb = geobubble(counties,'Latitude','Longitude',...
    'SizeVariable','Population2010');
geolimits([39.50 47.17],[-74.94 -65.40])
geobubble scales the bubble diameters linearly between the values specified by the SizeLimits property.
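To make "scales linearly" concrete, the following is a minimal numeric sketch of a linear mapping from data values to bubble widths. It illustrates the idea only and is not geobubble's internal code. The variable names and the sample values are hypothetical, and the width range of 5 to 20 is an assumption made for the illustration.

```matlab
% Hypothetical size data (not taken from counties.xlsx)
sizes = [10 55 100];

% Assumption: the size limits span the minimum and maximum of the data
sizeLimits = [min(sizes) max(sizes)];

% Assumption: illustrative smallest and largest bubble widths
widthRange = [5 20];

% Map each value to a fraction in [0,1], then to a width
t = (sizes - sizeLimits(1)) ./ diff(sizeLimits);
widths = widthRange(1) + t .* diff(widthRange);

disp(widths)
```

A value halfway between the two size limits lands halfway between the smallest and largest widths, which is what a linear scale implies.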
Use bubble color to show the number of Lyme disease cases in a county for a given year. To display this type of data, the geobubble function requires that the data be categorical. Initially, none of the columns in the table are categorical, but you can create one. For example, you can use the discretize function to create a categorical variable from the data in the Cases2010 variable. The new variable, named Severity, groups the data into three categories: Low, Medium, and High. Use this new variable as the ColorVariable parameter. These changes modify the table stored in the SourceTable property, which is a copy of the original table in the workspace, counties. Making changes to the table stored in the GeographicBubbleChart object avoids affecting the original data.
gb.SourceTable.Severity = discretize(counties.Cases2010,[0 50 100 500],...
    'categorical', {'Low', 'Medium', 'High'});
gb.ColorVariable = 'Severity';
When you plot the severity information, a fourth category appears in the color legend: undefined. This category can appear when the data you cast to categorical contains empty values or values that are out of scope for the categories you defined. Determine the cause of the undefined Severity value by hovering your cursor over the undefined bubble. The data tip shows that the bubble represents values in the 33rd row of the Lyme disease table.
Check the value of the variable used for Severity, Cases2010, which is the 12th variable in the 33rd row of the Lyme disease table.
gb.SourceTable(33,12)
ans=table
Cases2010
_________
514
The High category is defined as values between 100 and 500. However, the value of the Cases2010 variable is 514. To eliminate this undefined value, reset the upper limit of the High category to include this value. For example, use 5000.
gb.SourceTable.Severity = discretize(counties.Cases2010,[0 50 100 5000],...
    'categorical', {'Low', 'Medium', 'High'});
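The effect of the bin edges can also be checked without the chart. The sketch below uses a small hypothetical vector of case counts (only the value 514 comes from the example) and isundefined to flag values that fall outside the edges.

```matlab
% Hypothetical case counts; 514 mirrors the out-of-range value above
cases  = [12 75 320 514];
labels = {'Low','Medium','High'};

% Original edges: the last bin ends at 500, so 514 has no category
sev = discretize(cases,[0 50 100 500],'categorical',labels);
outOfRange = isundefined(sev);

% Widened edges: every value now falls into a bin
sevFixed = discretize(cases,[0 50 100 5000],'categorical',labels);

disp(find(outOfRange))          % positions of values with no category
disp(any(isundefined(sevFixed)))
```

Checking for undefined values this way is an alternative to hovering over bubbles when a table has many rows.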
Unlike the color variable, when geobubble encounters an undefined number (NaN) in the size, latitude, or longitude variables, it ignores the value.
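Because such values are ignored silently, it can be useful to count them before plotting. The following is a minimal sketch that uses a hypothetical three-row table. The variable names match the example, but the values are made up for illustration.

```matlab
% Hypothetical table with missing values (not from counties.xlsx)
T = table([41.2; NaN; 44.0],[-73.4; -72.7; -70.1],[100; 200; NaN], ...
    'VariableNames',{'Latitude','Longitude','Population2010'});

% Flag rows with NaN in the latitude, longitude, or size variable
vars = {'Latitude','Longitude','Population2010'};
hasNaN = any(isnan(T{:,vars}),2);

fprintf('%d of %d rows contain NaN\n',nnz(hasNaN),height(T));
```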
Use a color gradient to represent the Low-Medium-High categorization. geobubble stores the colors as an m-by-3 list of RGB values in the BubbleColorList property.
gb.BubbleColorList = autumn(3);
Change the color indicating high severity to be red rather than yellow. To change the color order, you can change the ordering of either the categories or the colors listed in the BubbleColorList property. For example, initially the categories are ordered Low-Medium-High. Use the reordercats function to change the categories to High-Medium-Low. The categories change in the color legend.
neworder = {'High','Medium','Low'};
gb.SourceTable.Severity = reordercats(gb.SourceTable.Severity,neworder);
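The other option mentioned above, reordering the colors rather than the categories, can be sketched as follows. This is an illustration, not part of the original example. It assumes the categories are still in Low-Medium-High order, and the assignment to gb is left as a comment so that the snippet runs without a chart.

```matlab
% autumn(3) runs from red (first row) to yellow (last row)
colors = autumn(3);

% Reverse the rows so that the last category (High) maps to red
flipped = flipud(colors);
disp(flipped)

% With an existing chart whose categories are Low-Medium-High:
% gb.BubbleColorList = flipped;
```

Reversing the colors leaves the legend order as Low-Medium-High, whereas reordercats changes the order in which the categories are listed.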
When you display a geographic bubble chart with size and color variables, the chart displays a size legend and color legend to indicate what the relative sizes and colors mean. When you specify a table as an argument, geobubble automatically uses the table variable names as legend titles, but you can specify other titles using properties.
title 'Lyme Disease in New England, 2010'
gb.SizeLegendTitle = 'County Population';
gb.ColorLegendTitle = 'Lyme Disease Severity';
Looking at the Lyme disease data, the trend appears to be that more cases occur in more populous counties. Looking at locations with the most cases per capita might be more interesting. Calculate the cases per 1000 people and display it on the chart.
gb.SourceTable.CasesPer1000 = gb.SourceTable.Cases2010 ./ gb.SourceTable.Population2010 * 1000;
gb.SizeVariable = 'CasesPer1000';
gb.SizeLegendTitle = 'Cases Per 1000';
The bubble sizes now tell a different story than before. The areas with the largest populations tracked relatively well with the different severity levels. However, when looking at the number of cases normalized by population, it appears that the highest risk per capita has a different geographic distribution.
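To see how normalizing by population can change a ranking, the sketch below applies the same calculation to a small hypothetical table and sorts it with sortrows. The county names and numbers are invented for illustration and do not come from counties.xlsx.

```matlab
% Hypothetical counties: populations and case counts are made up
T = table({'County A';'County B';'County C'}, ...
    [900000; 150000; 60000],[300; 120; 90], ...
    'VariableNames',{'CountyName','Population2010','Cases2010'});

% Same normalization as in the example: cases per 1000 people
T.CasesPer1000 = T.Cases2010 ./ T.Population2010 * 1000;

% Rank by per-capita rate, highest first
T = sortrows(T,'CasesPer1000','descend');
disp(T(:,{'CountyName','Cases2010','CasesPer1000'}))
```

In this made-up data, the county with the most cases has the lowest rate per 1000 people, which is the kind of reversal the chart can reveal.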
See also: categorical | discretize | geobubble | GeographicBubbleChart Properties | readtable | reordercats | table