Group data into bins or categories
Y = discretize(X,edges) returns the indices of the bins that contain the elements of X. The jth bin contains element X(i) if edges(j) <= X(i) < edges(j+1) for 1 <= j < N, where N is the number of bins and length(edges) = N+1. The last bin contains both edges such that edges(N) <= X(i) <= edges(N+1).
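The bin rule above can be illustrated with a minimal sketch. The sample data and edge values here are invented for illustration and are not part of the original reference text.

```matlab
% Illustrative data: three bins defined by four edges.
X = [1 3 5 7 9 10];
edges = [1 4 7 10];          % bins: [1,4), [4,7), [7,10]

% Each element of Y is the index of the bin containing the matching X(i).
Y = discretize(X,edges);     % Y is [1 1 2 3 3 3]
```

Note that 10 falls in bin 3 because the last bin includes its right edge.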
[___] = discretize(___,values) returns the corresponding element in values rather than the bin number, using any of the previous input or output argument combinations. For example, if X(1) is in bin 5, then Y(1) is values(5) rather than 5. values must be a vector with length equal to the number of bins.
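As a short sketch of the values syntax, the following maps each bin to a representative number. The data and the choice of values are assumptions made for illustration.

```matlab
% Illustrative data: three bins, so values needs three elements.
X = [1 3 5 7 9 10];
edges = [1 4 7 10];
values = [10 20 30];

% Y(i) is values(k), where k is the bin that contains X(i).
Y = discretize(X,edges,values);   % Y is [10 10 20 30 30 30]
```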
[___] = discretize(___,'categorical') creates a categorical array where each bin is a category. In most cases, the default category names are of the form “[A,B)” (or “[A,B]” for the last bin), where A and B are consecutive bin edges. If you specify dur as a character vector, then the default category names might have special formats. See Y for a listing of the display formats.
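A minimal sketch of the 'categorical' option follows, using invented sample data. The exact text of the default category names is not shown here because it depends on how MATLAB formats the edge values.

```matlab
% Illustrative data: three bins produce three categories.
X = [1 3 5 7 9 10];
edges = [1 4 7 10];

% Y is a categorical array with one category per bin.
Y = discretize(X,edges,'categorical');

% List the default category names, one per bin.
names = categories(Y);
```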
[___] = discretize(___,'categorical',displayFormat), for datetime or duration array inputs, uses the specified datetime or duration display format in the category names of the output.
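The following sketch shows one way the displayFormat argument might be used with datetime input. The dates, the edges, and the 'dd-MMM' format string are assumptions chosen for illustration; the resulting category names are not listed because their exact text depends on the format.

```matlab
% Illustrative datetime data in January 2024.
X = datetime(2024,1,[1 10 20 31]);
edges = datetime(2024,1,[1 11 21 31]);   % three bins

% Category names are built from the edges using the given display format.
Y = discretize(X,edges,'categorical','dd-MMM');

% Inspect the generated names.
names = categories(Y);
```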
[___] = discretize(___,'categorical',categoryNames) also names the categories in Y using the cell array of character vectors, categoryNames. The length of categoryNames must be equal to the number of bins.
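A brief sketch of custom category names, with sample data and labels invented for illustration:

```matlab
% Illustrative data: three bins, so three names are required.
X = [1 3 5 7 9 10];
edges = [1 4 7 10];
categoryNames = {'low','medium','high'};

% Each bin is labeled with the corresponding entry of categoryNames.
Y = discretize(X,edges,'categorical',categoryNames);

% Convert to a cell array of character vectors for inspection:
% {'low','low','medium','high','high','high'}
labels = cellstr(Y);
```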
[___] = discretize(___,'IncludedEdge',side), where side is 'left' or 'right', specifies whether each bin includes its right or left bin edge. For example, if side is 'right', then each bin includes the right bin edge, except for the first bin which includes both edges. In this case, the jth bin contains an element X(i) if edges(j) < X(i) <= edges(j+1), where 1 < j <= N and N is the number of bins. The first bin includes the left edge such that it contains edges(1) <= X(i) <= edges(2). The default for side is 'left'.
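The effect of 'IncludedEdge' is easiest to see when the data lies exactly on the bin edges. This sketch uses invented data for that purpose.

```matlab
% Illustrative data: every element sits exactly on a bin edge.
X = [1 4 7 10];
edges = [1 4 7 10];

% Default ('left'): bins are [1,4), [4,7), [7,10].
Yleft = discretize(X,edges);                          % [1 2 3 3]

% 'right': bins are [1,4], (4,7], (7,10].
Yright = discretize(X,edges,'IncludedEdge','right');  % [1 1 2 3]
```

Only the interior edge values 4 and 7 change bins between the two settings; the outermost edges 1 and 10 belong to the first and last bins in both cases.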
The behavior of discretize is similar to that of the histcounts function. Use histcounts to find the number of elements in each bin. On the other hand, use discretize to find which bin each element belongs to (without counting).
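The relationship between the two functions can be sketched as follows. The data is invented for illustration, and the accumarray step is one possible way to tally bin indices, assuming every element of X falls inside the bin range (elements outside it would not yield valid indices).

```matlab
% Illustrative data and edges shared by both functions.
X = [1 3 5 7 9 10];
edges = [1 4 7 10];

% histcounts returns how many elements fall in each bin.
counts = histcounts(X,edges);     % [2 1 3]

% discretize returns the bin index of each element.
bins = discretize(X,edges);       % [1 1 2 3 3 3]

% Tallying the bin indices reproduces the counts.
nBins = numel(edges) - 1;
countsFromBins = accumarray(bins(:),1,[nBins 1])';   % [2 1 3]
```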