groupcounts

Number of group elements

Description

example

G = groupcounts(T,groupvars) computes the number of elements in each group of data in a table or timetable, and returns a table containing the groups and their counts. A group is the set of unique combinations of grouping variables in groupvars. For example, G = groupcounts(T,'Gender') returns the number of Male elements and the number of Female elements in the variable Gender.

example

G = groupcounts(T,groupvars,groupbins) specifies how to bin the data in groupvars. For example, G = groupcounts(T,'SaleDate','year') gives the yearly sales counts according to SaleDate.

example

G = groupcounts(___,Name,Value) specifies additional grouping properties using one or more name-value pairs for any of the previous syntaxes. For example, G = groupcounts(T,'Category1','IncludeMissingGroups',false) excludes the group made from missing categorical data indicated by <undefined>.

B = groupcounts(A) returns a vector containing the group counts for each unique combination of grouping vectors represented in a column vector, matrix, or cell array of column vectors A.

B = groupcounts(A,groupbins) bins the data according to groupbins.

B = groupcounts(___,Name,Value) specifies additional grouping properties using one or more name-value pairs.

example

[B,BG] = groupcounts(A,___) also returns the groups corresponding to the counts in B.

Examples

collapse all

Compute the number of group elements from table data.

Create a table containing information about five individuals.

Gender = ["male";"female";"male";"female";"male"];
Smoker = logical([1;0;1;0;1]);
Weight = [176;163;131;133;119];
T = table(Gender,Smoker,Weight)
T=5×3 table
     Gender     Smoker    Weight
    ________    ______    ______

    "male"      true       176  
    "female"    false      163  
    "male"      true       131  
    "female"    false      133  
    "male"      true       119  

Count the number of elements in each group by gender.

G1 = groupcounts(T,'Gender')
G1=2×2 table
     Gender     GroupCount
    ________    __________

    "female"        2     
    "male"          3     

Count the number of elements in each group by gender and smoker status. By default, groupcounts suppresses groups with zero elements.

G2 = groupcounts(T,{'Gender','Smoker'})
G2=2×3 table
     Gender     Smoker    GroupCount
    ________    ______    __________

    "female"    false         2     
    "male"      true          3     

To count all groups, including those with zero elements, specify the 'IncludeEmptyGroups' parameter with value true.

G3 = groupcounts(T,{'Gender','Smoker'},'IncludeEmptyGroups',true)
G3=4×3 table
     Gender     Smoker    GroupCount
    ________    ______    __________

    "female"    false         2     
    "female"    true          0     
    "male"      false         0     
    "male"      true          3     

Group data according to specified bins.

Create a timetable containing sales information for days within a single month.

TimeStamps = datetime([2017 3 4; 2017 3 2; 2017 3 15; 2017 3 10;...
                       2017 3 14; 2017 3 31; 2017 3 25;...
                       2017 3 29; 2017 3 21; 2017 3 18]);
Profit = [2032 3071 1185 2587 1998 2899 3112 909 2619 3085]';
TotalItemsSold = [14 13 8 5 10 16 8 6 7 11]';
TT = timetable(TimeStamps,Profit,TotalItemsSold)
TT=10×2 timetable
    TimeStamps     Profit    TotalItemsSold
    ___________    ______    ______________

    04-Mar-2017     2032           14      
    02-Mar-2017     3071           13      
    15-Mar-2017     1185            8      
    10-Mar-2017     2587            5      
    14-Mar-2017     1998           10      
    31-Mar-2017     2899           16      
    25-Mar-2017     3112            8      
    29-Mar-2017      909            6      
    21-Mar-2017     2619            7      
    18-Mar-2017     3085           11      

Compute the group counts by the total items sold, binning the groups into intervals of item numbers.

G = groupcounts(TT,'TotalItemsSold',[0 4 8 12 16])
G=3×2 table
    disc_TotalItemsSold    GroupCount
    ___________________    __________

         [4, 8)                3     
         [8, 12)               4     
         [12, 16]              3     

Compute the group counts grouped by day of the week.

G = groupcounts(TT,'TimeStamps','dayname')
G=5×2 table
    dayname_TimeStamps    GroupCount
    __________________    __________

        Tuesday               2     
        Wednesday             2     
        Thursday              1     
        Friday                2     
        Saturday              3     

Compute the group counts for four groups based on their gender and smoker status.

Store patient information as three vectors of different types.

Gender = ["male";"female";"male";"female";"male"];
Smoker = logical([1;0;1;0;1]);
Weight = [176;163;131;133;119];

Grouping by gender and smoker status, compute the group counts. B contains the counts for each group. BG is a cell array containing two vectors that describe the groups as you look at their elements rowwise. For instance, the first row of BG{1} says that the patients in the first group are female, and the first row of BG{2} says that they are nonsmokers. The count for that group is 2, found in the corresponding row of B.

[B,BG] = groupcounts({Gender,Smoker},'IncludeEmptyGroups',true);
B
B = 4×1

     2
     0
     0
     3

BG{1}
ans = 4x1 string
    "female"
    "female"
    "male"
    "male"

BG{2}
ans = 4x1 logical array

   0
   1
   0
   1

Input Arguments

collapse all

Input data, specified as a table or timetable.

Input array, specified as a column vector, matrix, or cell array of column vectors representing grouping vectors. When A is a matrix, the grouping vectors are columnwise.

Grouping variables for table input, specified as a scalar, vector, matrix, cell array, function handle, or table vartype subscript.

groupvars indicates which columns of the input table to use to compute the groups, and can be one of the following:

  • A character vector or scalar string specifying a single table variable name

  • A cell array of character vectors or a string array, where each element is a table variable name

  • A vector of table variable indices

  • A logical vector whose elements each correspond to a table variable, where true includes the corresponding variable and false excludes it

  • A function handle that takes a table variable as input and returns a logical scalar

  • A table vartype subscript

Example: 'Age'

Example: {'Height','Weight'}

Example: vartype('numeric')

Binning scheme, specified as one of the following options:

  • 'none', indicating the groups are returned according to the specified grouping variables only

  • A list of bin edges, specified as a numeric vector, or a datetime vector for datetime grouping variables or vectors

  • A number of bins, specified as an integer scalar

  • A time duration, specified as a scalar of type duration or calendarDuration, indicating bin widths (for datetime or duration grouping variables or vectors only)

  • A cell array listing binning rules for each grouping variable or vector

  • A time bin for datetime and duration grouping variables or vectors only, specified as one of the following character vectors:

    ValueDescriptionData Type
    'second'

    Each bin is 1 second.

    datetime and duration
    'minute'

    Each bin is 1 minute.

    datetime and duration
    'hour'

    Each bin is 1 hour.

    datetime and duration
    'day'

    Each bin is 1 calendar day. This value accounts for Daylight Saving Time shifts.

    datetime and duration
    'week'Each bin is 1 calendar week.datetime only
    'month'Each bin is 1 calendar month.datetime only
    'quarter'Each bin is 1 calendar quarter.datetime only
    'year'

    Each bin is 1 calendar year. This value accounts for leap days.

    datetime and duration
    'decade'Each bin is 1 decade (10 calendar years).datetime only
    'century'Each bin is 1 century (100 calendar years).datetime only
    'secondofminute'

    Bins are seconds from 0 to 59.

    datetime only
    'minuteofhour'

    Bins are minutes from 0 to 59.

    datetime only
    'hourofday'

    Bins are hours from 0 to 23.

    datetime only
    'dayofweek'

    Bins are days from 1 to 7. The first day of the week is Sunday.

    datetime only
    'dayname'Bins are full day names such as 'Sunday'.datetime only
    'dayofmonth'Bins are days from 1 to 31.datetime only
    'dayofyear'Bins are days from 1 to 366.datetime only
    'weekofmonth'Bins are weeks from 1 to 6.datetime only
    'weekofyear'Bins are weeks from 1 to 54.datetime only
    'monthname'Bins are full month names such as 'January'.datetime only
    'monthofyear'

    Bins are months from 1 to 12.

    datetime only
    'quarterofyear'Bins are quarters from 1 to 4.datetime only

When multiple grouping variables or vectors are specified, you can provide a single binning rule that is applied to all grouping variables, or a cell array containing a binning method for each grouping variable, such as {'none',[0 2 4 Inf]}.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: G = groupcounts(T,groupvars,groupbins,'IncludedEdge','right')

Included bin edge, specified as either 'left' or 'right', indicating which end of the bin interval is inclusive.

This name-value pair can only be specified when groupbins is specified, and the value is applied to all binning schemes for all grouping variables or vectors.

Missing groups indicator, specified as a numeric or logical 1 (true) or 0 (false). If the parameter value is true, then groupcounts displays groups made up of missing values, such as NaN. If the parameter value is false, then groupcounts does not display the missing value groups.

Empty groups indicator, specified as a numeric or logical 0 (false) or 1 (true). If the parameter value is false, then groupcounts does not display groups with zero elements. If the parameter value is true, then groupcounts displays the empty groups.

Output Arguments

collapse all

Output table, returned as a table containing the computed groups and the number of elements in each group. For a single grouping variable, the output groups are sorted according to the order returned by the unique function with the 'sorted' option.

Group counts for non-table input data, returned as a column vector containing the number of elements in each group.

Groups for non-table input data, returned as a column vector or cell array of column vectors. For a single grouping vector, the output groups are sorted according to the order returned by the unique function with the 'sorted' option.

When you provide more than one input vector, BG is a cell array containing column vectors of equal length. The group information can be found by looking at the elements rowwise across all vectors in BG. The count for each group is contained in the corresponding row of the first output argument B.

Tips

  • When making many calls to groupcounts, consider converting grouping variables to type categorical or logical when possible for improved performance. For example, if you have a grouping variable of type char (such as Gender with elements 'Male' and 'Female'), you can convert it to a categorical variable using the command categorical(Gender).

Extended Capabilities

Introduced in R2019a