Class: matlab.compiler.mlspark.RDD
Package: matlab.compiler.mlspark
Return the number of partitions in an RDD
numPartitions = getNumPartitions(obj)
numPartitions = getNumPartitions(obj) returns the number of partitions in obj.
obj: An input RDD, specified as an RDD object.
numPartitions: Number of partitions in the input RDD, returned as a scalar value.
Use the getNumPartitions method to return the number of partitions in an RDD.
%% Connect to Spark
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','myApp', ...
    'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

%% getNumPartitions
inputRDD = sc.parallelize({'A','B','C','A','B'},2);
redRDD = inputRDD.map(@(x)({x,1})).reduceByKey(@(x,y)(x+y),3);
coaRDD = redRDD.coalesce(2); % {{{'B',2}},{{'C',1},{'A',2}}}*
disp(['Number of Partitions: ' num2str(coaRDD.getNumPartitions())]);
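The partition count can also be checked at each stage of a pipeline, which makes the effect of coalesce easier to see. The following is a minimal sketch, not part of the original example. It assumes a local Spark installation configured for MATLAB Compiler, and the application name 'partitionDemo' and the variable names nBefore and nAfter are illustrative choices.

```matlab
%% Connect to Spark (assumes a local Spark setup, as in the example above)
sparkProp = containers.Map({'spark.executor.cores'}, {'1'});
conf = matlab.compiler.mlspark.SparkConf('AppName','partitionDemo', ...
    'Master','local[1]','SparkProperties',sparkProp);
sc = matlab.compiler.mlspark.SparkContext(conf);

%% Compare the partition count before and after coalesce
% Request three partitions when creating the RDD.
numRDD = sc.parallelize({1,2,3,4,5,6},3);
nBefore = numRDD.getNumPartitions();

% Reduce the RDD to two partitions.
smallRDD = numRDD.coalesce(2);
nAfter = smallRDD.getNumPartitions();

fprintf('Partitions before coalesce: %d\n', nBefore);
fprintf('Partitions after coalesce: %d\n', nAfter);

%% Shut down the connection to Spark
delete(sc);
```

Because getNumPartitions returns a scalar, the result can be used directly in numeric comparisons or formatted output.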
See also: coalesce | getDefaultReducePartitions | map | parallelize