Package 'mimiSBM' reference manual

Title:	Mixture of Multilayer Integrator Stochastic Block Models
Description:	Our approach uses a mixture of multilayer stochastic block models to group co-membership matrices with similar information into components and to partition observations into different clusters. See De Santiago (2023, ISBN: 978-2-87587-088-9).
Authors:	Kylliann De Santiago [aut, cre], Marie Szafranski [aut], Christophe Ambroise [aut]
Maintainer:	Kylliann De Santiago <[email protected]>
License:	GPL-3
Version:	0.0.1.3
Built:	2025-02-27 03:00:31 UTC
Source:	https://github.com/cran/mimiSBM

mimiSBM model for fixed K and Q

Description

mimiSBM model for fixed K and Q

Usage

BayesianMixture_SBM_model(
  A,
  K,
  Q,
  beta_0 = rep(1/2, K),
  theta_0 = rep(1/2, Q),
  eta_0 = array(rep(1/2, K * K * Q), c(K, K, Q)),
  xi_0 = array(rep(1/2, K * K * Q), c(K, K, Q)),
  tol = 0.001,
  iter_max = 10,
  n_init = 1,
  alternate = TRUE,
  Verbose = TRUE,
  eps_conv = 1e-04,
  type_init = "SBM",
  nbCores = 2
)

Arguments

`A`	an array of dim=c(N,N,V)
`K`	number of clusters
`Q`	number of components
`beta_0`	hyperparameters for beta
`theta_0`	hyperparameters for theta
`eta_0`	hyperparameters for eta
`xi_0`	hyperparameters for xi
`tol`	convergence tolerance on the ELBO
`iter_max`	maximum number of iterations of mimiSBM
`n_init`	number of initializations of the mimiSBM algorithm
`alternate`	boolean; if TRUE, an M-step is performed after each part of the E-step (once after the u optimization and once after the tau optimization). If FALSE, u and tau are both optimized first and a single M-step follows.
`Verbose`	boolean; if TRUE, prints information on model fitting
`eps_conv`	convergence threshold for tau
`type_init`	type of initialization, one of "SBM", "Kmeans" or "random"
`nbCores`	the number of cores used to parallelize the calculations. See the vignette for more details.

Value

model with estimation of coefficients.
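
The default hyperparameters in the Usage block depend only on K and Q. As a minimal base-R sketch (it does not call the package; the sizes N, V, K, Q and the random toy array A are illustrative assumptions, as is the choice of symmetric binary slices), the inputs could be assembled like this:

```r
# Illustrative sizes (assumptions, not package defaults)
N <- 6; V <- 4; K <- 3; Q <- 2

# Toy input: V symmetric binary N x N matrices stacked along the third dimension
set.seed(1)
A <- array(0, dim = c(N, N, V))
for (v in seq_len(V)) {
  M <- matrix(rbinom(N * N, 1, 0.3), N, N)
  M[lower.tri(M)] <- t(M)[lower.tri(M)]  # copy the upper triangle into the lower one
  diag(M) <- 0                           # no self-loops
  A[, , v] <- M
}

# Default hyperparameters, written out as in the Usage block
beta_0  <- rep(1/2, K)
theta_0 <- rep(1/2, Q)
eta_0   <- array(rep(1/2, K * K * Q), c(K, K, Q))
xi_0    <- array(rep(1/2, K * K * Q), c(K, K, Q))
```

With these objects, beta_0 has one entry per cluster, theta_0 one entry per component, and eta_0 and xi_0 one entry per (cluster, cluster, component) triple.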

Clustering Matrix : One hot encoding

Description

Clustering Matrix : One hot encoding

Usage

CEM(Z)

Arguments

`Z`	an N x K matrix whose rows give, for each observation, the probabilities of belonging to each cluster.

Value

Z a matrix N x K One-Hot-Encoded by rows, where K is the number of clusters.
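
A row-wise one-hot encoding of this kind can be sketched in base R as follows. This illustrates the described behaviour and is not the package's code; the helper name `one_hot_rows` and the tie-breaking rule (first maximum wins) are assumptions.

```r
# Hypothetical helper: put a 1 at the column of each row's maximum, 0 elsewhere
one_hot_rows <- function(Z) {
  out <- matrix(0, nrow(Z), ncol(Z))
  idx <- max.col(Z, ties.method = "first")  # column index of each row's maximum
  out[cbind(seq_len(nrow(Z)), idx)] <- 1
  out
}

Z <- matrix(c(0.7, 0.2, 0.1,
              0.1, 0.3, 0.6), nrow = 2, byrow = TRUE)
one_hot_rows(Z)
```

Each row of the result contains exactly one 1, located at the most probable cluster for that observation.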

Examples

Z <- matrix(rnorm(12),3,4)
Z_cem <- CEM(Z)
print(Z_cem)

Set diagonal coefficients to 0 on each slice along the 3rd dimension.

Description

Set diagonal coefficients to 0 on each slice along the 3rd dimension.

Usage

diag_nulle(A)

Arguments

`A`	an array of dimension dim=c(N,N,V)

Value

A with zeros on the diagonal of each slice along the 3rd dimension.
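
The operation can be sketched in base R as below. The helper name `zero_diagonals` is hypothetical and the code is an illustration of the described result, not the package's implementation.

```r
# Hypothetical helper: zero the diagonal of every N x N slice of an N x N x V array
zero_diagonals <- function(A) {
  for (v in seq_len(dim(A)[3])) {
    S <- A[, , v]   # extract slice v as a matrix
    diag(S) <- 0    # remove self-connections
    A[, , v] <- S   # write the slice back
  }
  A
}

A <- array(1, dim = c(3, 3, 2))
B <- zero_diagonals(A)
B[, , 1]
```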

SBM on each layer

Description

SBM on each layer

Usage

fit_SBM_per_layer(A, silent = FALSE, ncores = 2)

Arguments

`A`	an array of dim=c(N,N,V)
`silent`	boolean controlling verbosity; if TRUE, progress messages are suppressed
`ncores`	the number of cores used to parallelize the calculations of the various SBMs

Value

a list containing the parameters of each SBM applied to each view

SBM on each layer - parallelized

Description

SBM on each layer - parallelized

Usage

fit_SBM_per_layer_parallel(A, nbCores = 2)

Arguments

`A`	an array of dim=c(N,N,V)
`nbCores`	the number of cores used to parallelize the calculations of the various SBMs

Value

a list containing the parameters of each SBM applied to each view

Initialization of mimiSBM parameters

Description

Initialization of mimiSBM parameters

Usage

initialisation_params_bayesian(
  A,
  K,
  Q,
  beta_0 = rep(1/2, K),
  theta_0 = rep(1/2, Q),
  eta_0 = array(rep(1/2, K * K * Q), c(K, K, Q)),
  xi_0 = array(rep(1/2, K * K * Q), c(K, K, Q)),
  type_init = "SBM",
  nbCores = 2
)

Arguments

`A`	an array of dim=c(N,N,V)
`K`	Number of clusters
`Q`	Number of components
`beta_0`	hyperparameters for beta
`theta_0`	hyperparameters for theta
`eta_0`	hyperparameters for eta
`xi_0`	hyperparameters for xi
`type_init`	select the type of initialization type_init=c("SBM","Kmeans","random")
`nbCores`	the number of cores used to parallelize the calculations of the various SBMs

Value

the updated list of parameters (params)

Label Switching

Description

This function can be used to perturb a clustering vector in order to randomly associate certain individuals with another cluster.

Usage

lab_switching(Z, p_out = 0.1)

Arguments

`Z`	a clustering vector
`p_out`	a probability of perturbation for the clustering

Value

a perturbed clustering vector
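
The manual does not spell out how the perturbation is drawn. One plausible reading, sketched here in base R, is that each entry is independently reassigned with probability p_out to a label drawn uniformly from the observed labels. The helper name `perturb_labels` and this exact mechanism are assumptions, not the package's code.

```r
# Hypothetical helper: with probability p_out, replace a label by one drawn
# uniformly from the set of observed labels (possibly the same label again)
perturb_labels <- function(Z, p_out = 0.1) {
  labels <- sort(unique(Z))
  flip <- runif(length(Z)) < p_out  # which entries get redrawn
  Z[flip] <- labels[sample.int(length(labels), sum(flip), replace = TRUE)]
  Z
}

set.seed(1)
Z <- sample(1:4, 100, replace = TRUE)
Z_pert <- perturb_labels(Z, p_out = 0.1)
mean(Z != Z_pert)  # fraction of labels that actually changed
```

Because a redrawn label can coincide with the original one, the fraction of changed labels under this sketch is on average below p_out.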

Examples

Z <- sample(1:4,100,replace=TRUE)
p = 0.1
Z_pert <- lab_switching(Z,p)
table("Initial clustering" = Z,"Perturbed clustering" = Z_pert)

log softmax of matrices (by row)

Description

log softmax of matrices (by row)

Usage

log_Softmax(log_X)

Arguments

`log_X`	a matrix of log(X)

Value

X with log_softmax function applied on each row
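
For reference, a row-wise log-softmax maps each entry log_X[i, k] to log_X[i, k] - log(sum_j exp(log_X[i, j])). A minimal base-R sketch using the log-sum-exp trick is given below; the helper name `log_softmax_rows` is hypothetical and the package may compute this differently.

```r
# Hypothetical helper: row-wise log-softmax with the log-sum-exp trick
log_softmax_rows <- function(log_X) {
  m <- apply(log_X, 1, max)                 # row maxima, for numerical stability
  lse <- m + log(rowSums(exp(log_X - m)))   # row-wise log-sum-exp
  log_X - lse                               # subtract each row's normaliser
}

log_X <- log(matrix(c(1, 2, 3, 4, 5, 6), nrow = 2))
out <- log_softmax_rows(log_X)
rowSums(exp(out))  # each row sums to 1
```

Exponentiating the result gives, for each row, the original values of X normalised to sum to 1.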

Examples

set.seed(42)
X <- matrix(rnorm(15,mean=5),5,3)
log_X <- log(X)
X_softmax <- log_Softmax(log_X)

mimiSBM Evidence Lower Bound (ELBO)

Description

mimiSBM Evidence Lower Bound (ELBO)

Usage

Loss_BayesianMSBM(params)

Arguments

`params`	a list of parameters of the model

Value

computation of the mimiSBM ELBO

Create probability-component list for clustering per view component.

Description

Create probability-component list for clustering per view component.

Usage

Mat_lien_alpha(clusters, K_barre, K)

Arguments

`clusters`	list of links between the final clustering and the clustering per view component.
`K_barre`	number of clusters in the final clustering
`K`	vector of size Q indicating the number of clusters in each component.

Value

alpha : probability-component list for clustering per view component.

Mixture of Multilayer Integrator SBM (mimiSBM)

Description

Model that allows both clustering of individuals and grouping of views by component. This Bayesian model estimates the probability that each individual belongs to each cluster (clusters are shared across all views) and the component membership of each view. In addition, the connectivity tensor between classes, conditional on the components, is also estimated.

Usage

mimiSBM(
  A,
  Kset,
  Qset,
  beta_0 = 1/2,
  theta_0 = 1/2,
  eta_0 = 1/2,
  xi_0 = 1/2,
  criterion = "ILVB",
  tol = 0.001,
  iter_max = 10,
  n_init = 1,
  alternate = FALSE,
  Verbose = FALSE,
  eps_conv = 1e-04,
  type_init = "SBM"
)

Arguments

`A`	an array of dim=c(N,N,V)
`Kset`	set of candidate numbers of clusters
`Qset`	set of candidate numbers of components
`beta_0`	hyperparameters for beta
`theta_0`	hyperparameters for theta
`eta_0`	hyperparameters for eta
`xi_0`	hyperparameters for xi
`criterion`	model selection criterion, one of "ILVB", "ICL_approx", "ICL_variationnel" or "ICL_exact"
`tol`	convergence tolerance on the ELBO
`iter_max`	maximum number of iterations of mimiSBM
`n_init`	number of initializations of the mimiSBM algorithm
`alternate`	boolean; if TRUE, an M-step is performed after each part of the E-step (once after the u optimization and once after the tau optimization). If FALSE, u and tau are both optimized first and a single M-step follows.
`Verbose`	boolean; if TRUE, prints information on model fitting
`eps_conv`	convergence threshold for tau
`type_init`	type of initialization, one of "SBM", "Kmeans" or "random"

Value

The best model, conditionally on the criterion, and its parameters.

Examples

set.seed(42)
K = c(2,3); pi_k = rep(1/4,4) ; rho = rep(1/2,2)
res <- rSMB_partition(N = 50,V = 5,K = K ,pi_k = pi_k ,rho = rho,p_switch = 0.1)
A = res$simulation$A ; Kset = 4 ; Qset = 2
model <- mimiSBM(A,Kset,Qset,n_init = 1, Verbose=FALSE)

Calculation of Log multinomial Beta value.

Description

Calculation of Log multinomial Beta value.

Usage

multinomial_lbeta_function(x)

Arguments

`x`	a vector

Value

sum(lgamma(x[j])) - lgamma(sum(x))
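
The formula above can be written directly in base R. The helper name `log_multi_beta` is hypothetical; the body only transcribes the expression given under Value, and in the two-element case it coincides with base R's `lbeta`.

```r
# Hypothetical helper transcribing the formula: sum_j lgamma(x[j]) - lgamma(sum(x))
log_multi_beta <- function(x) sum(lgamma(x)) - lgamma(sum(x))

log_multi_beta(c(2, 3))       # agrees with lbeta(2, 3) for a two-element vector
log_multi_beta(rep(1/2, 3))   # e.g. a symmetric hyperparameter vector
```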

One Hot Encoding with Error machine

Description

One Hot Encoding with Error machine

Usage

one_hot_errormachine(Z, size = NULL)

Arguments

`Z`	a vector of size N, where the value Z[i] indicates the cluster membership of observation i.
`size`	optional parameter indicating the number of classes (avoids problems with empty classes).

Value

Z a matrix N x K One-Hot-Encoded by rows, where K is the number of clusters.
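
The role of `size` is easiest to see on a small case. The base-R sketch below is an illustration only; the helper name `one_hot_labels` and the assumption that labels are positive integers 1..K are not taken from the package.

```r
# Hypothetical helper: one-hot encode integer labels 1..K; `size` fixes the
# number of columns so that empty clusters still get a column
one_hot_labels <- function(Z, size = NULL) {
  K <- if (is.null(size)) max(Z) else size
  out <- matrix(0, length(Z), K)
  out[cbind(seq_along(Z), Z)] <- 1  # set entry (i, Z[i]) to 1
  out
}

Z <- c(1, 3, 3, 1)
one_hot_labels(Z)            # 4 x 3; column 2 is all zeros
one_hot_labels(Z, size = 4)  # 4 x 4; column 4 is kept for an empty cluster
```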

Examples

Z <- sample(1:4,10,replace=TRUE)
Z_OHE <- one_hot_errormachine(Z)
print(Z_OHE)

Create a link between final clustering and clustering per view component.

Description

Create a link between final clustering and clustering per view component.

Usage

partition_K_barre(K_barre, K)

Arguments

`K_barre`	Number of clusters in the final clustering
`K`	vector of size Q indicating the number of clusters in each component, with K[q] <= K_barre for all q

Value

cluster : a list of links between the final clustering and the clustering per view component.

Plot adjacency matrices

Description

A function to plot each adjacency matrix defined by the third dimension of an array, and to plot the sum of all these matrices.

Usage

plot_adjacency(A)

Arguments

`A`	an array with dim=c(N,N,V).

Value

None

Simulate data from the mimiSBM generative model.

Description

Simulate data from the mimiSBM generative model.

Usage

rMSBM(N, V, alpha_klq, pi_k, rho, sorted = TRUE, p_switch = NULL)

Arguments

`N`	Number of individuals.
`V`	Number of views.
`alpha_klq`	array of component-connection probability (K,K,Q).
`pi_k`	Vector of proportions of individuals across clusters.
`rho`	Vector of proportion of views across components.
`sorted`	Boolean for simulation reordering (clusters and components membership).
`p_switch`	probability of label switching; if NULL, no perturbation is applied between the true clustering and the connectivity of individuals.

Value

list with the parameters of the simulation ($params), and the simulations ($simulation).
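
To make the roles of alpha_klq, pi_k and rho concrete, here is a base-R sketch of a generic mixture-of-multilayer-SBM draw. It is not the package's implementation: the helper name `simulate_msbm_sketch`, the choice of undirected graphs without self-loops, and the omission of `sorted` and `p_switch` are all assumptions.

```r
# Hypothetical helper: cluster labels Z from pi_k, view components W from rho,
# then Bernoulli edges with probability alpha_klq[Z[i], Z[j], W[v]]
simulate_msbm_sketch <- function(N, V, alpha_klq, pi_k, rho) {
  K <- dim(alpha_klq)[1]
  Z <- sample.int(length(pi_k), N, replace = TRUE, prob = pi_k)  # cluster per individual
  W <- sample.int(length(rho), V, replace = TRUE, prob = rho)    # component per view
  A <- array(0, dim = c(N, N, V))
  for (v in seq_len(V)) {
    alpha_v <- matrix(alpha_klq[, , W[v]], K, K)  # K x K connectivity of this view
    P <- alpha_v[Z, Z]                            # N x N edge probabilities
    M <- matrix(rbinom(N * N, 1, P), N, N)
    M[lower.tri(M, diag = TRUE)] <- 0             # keep the strict upper triangle
    A[, , v] <- M + t(M)                          # symmetric, zero diagonal
  }
  list(A = A, Z = Z, W = W)
}

set.seed(42)
K <- 2; Q <- 2
alpha_klq <- array(c(0.8, 0.1, 0.1, 0.8,   # component 1: dense within clusters
                     0.1, 0.7, 0.7, 0.1),  # component 2: dense between clusters
                   dim = c(K, K, Q))
sim <- simulate_msbm_sketch(N = 20, V = 5, alpha_klq,
                            pi_k = c(0.5, 0.5), rho = c(0.5, 0.5))
dim(sim$A)
```

Views assigned to the same component share one K x K connectivity matrix, which is what lets the model group views with similar information.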

Simulation of a mixture of multilayer stochastic block models

Description

This simulation process assumes that we have partial information on the clustering within each view component, and that the final clustering of individuals depends on a combination of the clusterings on each of the views. It also accounts for possible label switching: each individual belongs to the wrong class with a certain probability, which perturbs the adjacency matrices and makes the simulation more realistic and more complex.

Usage

rSMB_partition(N, V, K, pi_k, rho, sorted = TRUE, p_switch = NULL)

Arguments

`N`	Number of observations
`V`	Number of views
`K`	vector of size Q indicating the number of clusters in each component.
`pi_k`	vector of proportions of observations across clusters.
`rho`	vector of proportions of views across components.
`sorted`	boolean for simulation reordering (cluster and component memberships).
`p_switch`	probability of label switching; if NULL, no perturbation is applied between the true clustering and the connectivity of individuals.

Details

See the vignette for more information.

Value

list with the parameters of the simulation ($params), and the simulations ($simulation).

Sort the clustering matrix

Description

Sort the clustering matrix

Usage

sort_Z(Z)

Arguments

`Z`	an N x K matrix whose rows give, for each observation, the probabilities of belonging to each cluster.

Value

a sorted matrix

Transposition of an array

Description

Transposition of an array

Usage

transpo(A)

Arguments

`A`	an array of dim=c(.,.,V)

Value

A_transposed, the array with each slice along the third dimension transposed
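
Slice-wise transposition of a three-dimensional array can be expressed in base R with `aperm`. This is a sketch of the described result, not necessarily how the package implements it.

```r
A <- array(1:12, dim = c(2, 3, 2))
A_t <- aperm(A, c(2, 1, 3))  # swap the first two dimensions, keep the third
dim(A_t)
```

Each slice A_t[, , v] is then the ordinary matrix transpose of A[, , v].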

Upper triangular Matrix/Array

Description

Upper triangular Matrix/Array

Usage

trig_sup(A, transp = FALSE, diag = TRUE)

Arguments

`A`	an array or a square matrix
`transp`	boolean indicating whether a transposition is needed.
`diag`	boolean; if TRUE, the diagonal is not used.

Value

an array or a square matrix in which only the upper-triangular coefficients can be non-zero
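
Keeping only the upper triangle of each slice can be sketched in base R with `upper.tri`. The helper name `upper_only` and its `drop_diag` argument are hypothetical; the sketch handles arrays only and reads "diagonal is not used" as "diagonal set to zero", which is an assumption.

```r
# Hypothetical helper: zero everything outside the upper triangle of each slice
upper_only <- function(A, drop_diag = TRUE) {
  keep <- upper.tri(A[, , 1], diag = !drop_diag)  # mask of entries to keep
  for (v in seq_len(dim(A)[3])) {
    S <- A[, , v]
    S[!keep] <- 0
    A[, , v] <- S
  }
  A
}

A <- array(1, dim = c(3, 3, 2))
upper_only(A)[, , 1]                     # strict upper triangle only
upper_only(A, drop_diag = FALSE)[, , 1]  # upper triangle including the diagonal
```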

Update of the Bayesian parameter beta

Description

Update of the Bayesian parameter beta

Usage

update_beta_bayesian(params)

Arguments

`params`	list of parameters of the model

Value

params with beta updated

Update of the Bayesian parameter eta

Description

Update of the Bayesian parameter eta

Usage

update_eta_bayesian(A, params)

Arguments

`A`	an array of dim=c(N,N,V)
`params`	list of parameters of the model

Value

params with eta updated

Update of the Bayesian parameter tau

Description

Update of the Bayesian parameter tau

Usage

update_tau_bayesian(A, params, eps_conv = 1e-04)

Arguments

`A`	an array of dim=c(N,N,V)
`params`	list of parameters of the model
`eps_conv`	convergence threshold for tau.

Value

params with tau updated

Update of the Bayesian parameter theta

Description

Update of the Bayesian parameter theta

Usage

update_theta_bayesian(params)

Arguments

`params`	list of parameters of the model

Value

params with theta updated

Update of the Bayesian parameter u

Description

Update of the Bayesian parameter u

Usage

update_u_bayesian(A, params)

Arguments

`A`	an array of dim=c(N,N,V)
`params`	list of parameters of the model

Value

params with u updated

Update of the Bayesian parameter xi

Description

Update of the Bayesian parameter xi

Usage

update_xi_bayesian(A, params)

Arguments

`A`	an array of dim=c(N,N,V)
`params`	list of parameters of the model

Value

params with xi updated

Variational Bayes Expectation Maximization

Description

Variational Bayes Expectation Maximization

Usage

VBEM_step(A, params, alternate = TRUE, eps_conv = 0.001)

Arguments

`A`	an array of dim=c(N,N,V)
`params`	list of parameters of the model
`alternate`	boolean; if TRUE, an M-step is performed after each part of the E-step (once after the u optimization and once after the tau optimization). If FALSE, u and tau are both optimized first and a single M-step follows.
`eps_conv`	convergence threshold for tau.

Value

params with updated parameters.
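
The effect of `alternate` on the order of operations can be illustrated with stub functions that only record when they are called. Everything here is hypothetical: the stub names, the assumption that u is updated before tau, and the grouping of the remaining updates into a single "M" step are illustrative and not taken from the package source.

```r
# Hypothetical sketch: record the order of E- and M-step calls for one VBEM step
vbem_order_sketch <- function(alternate = TRUE) {
  calls <- character(0)
  e_step_u   <- function() calls <<- c(calls, "E:u")    # stands in for the u update
  e_step_tau <- function() calls <<- c(calls, "E:tau")  # stands in for the tau update
  m_step     <- function() calls <<- c(calls, "M")      # stands in for the M-step
  if (alternate) {
    e_step_u();   m_step()   # M-step right after the u optimization
    e_step_tau(); m_step()   # and again after the tau optimization
  } else {
    e_step_u(); e_step_tau() # full E-step first
    m_step()                 # then a single M-step
  }
  calls
}

vbem_order_sketch(TRUE)   # "E:u" "M" "E:tau" "M"
vbem_order_sketch(FALSE)  # "E:u" "E:tau" "M"
```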
