This function identifies transitive clusters (i.e. connected components) as well as the number of members in each cluster, and adds this information to the linelist data.
get_clusters(
  x,
  output = c("epicontacts", "data.frame"),
  member_col = "cluster_member",
  size_col = "cluster_size",
  override = FALSE
)
x | An epicontacts object. |
---|---|
output | A character string indicating the type of output: either an epicontacts object (the default) or a data.frame. |
member_col | Name of the linelist column to which cluster membership is assigned. Default name is 'cluster_member'. |
size_col | Name of the linelist column to which cluster sizes are assigned. Default name is 'cluster_size'. |
override | Logical value indicating whether cluster member and size columns should be overwritten if they already exist in the linelist. Default is 'FALSE'. |
An epicontacts object whose 'linelist' dataframe contains new columns corresponding to cluster membership and size, or a data.frame containing member ids, cluster memberships as factors, and associated cluster sizes. All ids that were originally in the 'contacts' dataframe but not in the linelist will also be added to the linelist.
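To make the returned columns concrete, the following base R sketch computes cluster membership and cluster size for a small, made-up contact list. It is not the package's implementation: the toy data, the variable names, and the label-propagation approach are all assumptions chosen for illustration, and contacts are treated as undirected.

```r
# Toy contact list (hypothetical data): each row is one contact between two ids.
contacts <- data.frame(
  from = c("a", "b", "d"),
  to   = c("b", "c", "e"),
  stringsAsFactors = FALSE
)

# Linelist ids: "f" has no contacts, and "e" appears only in the contacts.
linelist_ids <- c("a", "b", "c", "d", "f")

# Collect ids from both tables, so contact-only ids are included as well.
ids <- union(linelist_ids, c(contacts$from, contacts$to))

# Label propagation (an illustrative technique, assumed here): start with one
# label per id, then give both ends of every contact the smaller of their two
# labels, repeating until no label changes.
label <- setNames(seq_along(ids), ids)
repeat {
  changed <- FALSE
  for (i in seq_len(nrow(contacts))) {
    ends <- c(contacts$from[i], contacts$to[i])
    lowest <- min(label[ends])
    if (any(label[ends] != lowest)) {
      label[ends] <- lowest
      changed <- TRUE
    }
  }
  if (!changed) break
}

# Renumber the labels 1..k and count the members of each cluster.
member <- match(label, unique(label))
clusters <- data.frame(
  id = ids,
  cluster_member = factor(member),
  cluster_size = as.integer(table(member)[as.character(member)]),
  stringsAsFactors = FALSE
)
clusters
```

In this sketch, ids a, b and c end up in one cluster, d and e in a second, and f forms a cluster of size one. The column names follow the defaults 'cluster_member' and 'cluster_size' described above.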
Nistara Randhawa (nrandhawa@ucdavis.edu)
if (require(outbreaks)) {
  ## build data
  x <- make_epicontacts(ebola_sim$linelist, ebola_sim$contacts,
                        id = "case_id", to = "case_id", from = "infector",
                        directed = TRUE)

  ## add cluster membership and sizes to epicontacts 'linelist'
  y <- get_clusters(x, output = "epicontacts")
  y

  ## return a data.frame with linelist member ids and cluster memberships as
  ## factors
  z <- get_clusters(x, output = "data.frame")
  head(z)
}
#>   cluster_member     id cluster_size
#> 1              1 d1fafd            2
#> 2              1 53371b            2
#> 3              2 f5c3d8            6
#> 4              2 900021            6
#> 5              2 0f58c4            6
#> 6              2 d58402            6