Sankey plots


Sankey plots are often used to illustrate flows, for example treatment changes in medicine. The plot above illustrates the treatment changes in asthma patients. Some small flows are omitted.

Thanks to the networkD3 package, making Sankey plots has become fun! Because the package builds on JavaScript (D3), the plots are interactive.

There are, however, some traps when using this package. The major one is that colors change from plot to plot. To keep them fixed, we need to pass them via the d3.scaleOrdinal() function and take care of the types that are missing from the data. You can use the following code:


# 21 colors, one per type
colors.list <- c("#FFC312", "#C4E538", "#12CBC4", "#FDA7DF", "#ED4C67",
                 "#F79F1F", "#A3CB38", "#1289A7", "#D980FA", "#B53471",
                 "#EE5A24", "#009432", "#0652DD", "#9980FA", "#833471",
                 "#EA2027", "#006266", "#1B1464", "#5758BB", "#6F1E51",
                 "#747d8c")

# 21 type names; they must match the values in the "group" columns exactly
types.list <- c("type 0", "type 1", "type 2", "type 3",
                "type 4", "type 5", "type 6", "type 7", "type 8", "type 9",
                "type 10", "type 11", "type 12", "type 13", "type 14", "type 15",
                "type 16", "type 17", "type 18", "type 19", "type 20")

# which.not.missing: indices of the types present in the current data
# (define it beforehand), so each type keeps its color in every plot
colors <- paste(colors.list[which.not.missing], collapse = '", "')
types  <- paste(types.list[which.not.missing], collapse = '", "')

# Build the JavaScript colour scale for sankeyNetwork();
# paste0() avoids stray spaces inside the quoted strings
colorJS <- paste0('d3.scaleOrdinal().domain(["', types, '"]).range(["', colors, '"])')
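
The snippet above assumes that which.not.missing already exists. A minimal sketch of one way to derive it, assuming the nodes data frame has a "group" column holding the type names (the small inputs and the js_array() helper are hypothetical, for illustration only):

```r
# Illustrative inputs (hypothetical): three possible types, two present in the data
types.list  <- c("type 0", "type 1", "type 2")
colors.list <- c("#FFC312", "#C4E538", "#12CBC4")
nodes <- data.frame(name  = c("A", "B", "C"),
                    group = c("type 0", "type 2", "type 0"),
                    stringsAsFactors = FALSE)

# Positions of the types that actually occur among the node groups
which.not.missing <- which(types.list %in% nodes$group)

# Hypothetical helper: quote each element and join into a JavaScript array literal
js_array <- function(x) paste0('["', paste(x, collapse = '", "'), '"]')

colorJS <- paste0("d3.scaleOrdinal().domain(",
                  js_array(types.list[which.not.missing]),
                  ").range(",
                  js_array(colors.list[which.not.missing]), ")")
cat(colorJS, "\n")
```

Because the positions index both vectors, "type 2" keeps its third color even though "type 1" is absent from this data set.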

Then pass it via sankeyNetwork:

library(networkD3)

# NodeGroup refers to a column of nodes, LinkGroup to a column of links
sankeyNetwork(Links = links, Nodes = nodes,
              Source = "source", Target = "target",
              Value = "value", NodeID = "name",
              fontSize = 12, nodeWidth = 10,
              colourScale = colorJS, NodeGroup = "group",
              LinkGroup = "group")

A very handy idea is to use a multistate model to compute the flows of interest (see the etm package in R!).

library(etm)

# tra: logical matrix of the possible transitions between the 22 states;
# start and stop: the time points between which probabilities are estimated
tr.prob.st <- etm(data[data$time > 0, ], as.character(1:22), tra,
                  cens.name = NULL, s = start, t = stop)

# Transition probabilities from start to stop time
probst <- summary(tr.prob.st)
probst <- do.call(rbind, probst)
for.plot <- probst[probst$time == stop, ]

# Percentages used as the link values in the Sankey plot
links.st <- round(for.plot$P * 100, digits = 0)
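
To turn such percentages into input for sankeyNetwork(), the Source and Target columns must be zero-based indices into the nodes data frame. A minimal sketch, assuming the transitions are available as "from" and "to" labels (the flows data frame and its values are made up for illustration and are not output of etm):

```r
# Hypothetical transitions: state labels and rounded percentages (made-up values)
flows <- data.frame(from  = c("type 0", "type 0", "type 1"),
                    to    = c("type 1", "type 2", "type 2"),
                    value = c(60, 25, 15),
                    stringsAsFactors = FALSE)

# One node per distinct state; the group column is what NodeGroup refers to
nodes <- data.frame(name = unique(c(flows$from, flows$to)),
                    stringsAsFactors = FALSE)
nodes$group <- nodes$name

# match() is 1-based, so subtract 1 to get the zero-based indices
links <- data.frame(source = match(flows$from, nodes$name) - 1,
                    target = match(flows$to, nodes$name) - 1,
                    value  = flows$value,
                    group  = flows$from,
                    stringsAsFactors = FALSE)

# Drop flows that round to zero
links <- links[links$value > 0, ]
```

Here each link is colored by its origin state; using the target state as the link group would work the same way.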

Since the resulting Sankey plot is an HTML widget, you can save it with saveWidget() from the htmlwidgets package:

library(htmlwidgets)

p <- sankeyNetwork(Links = links, Nodes = nodes,
                   Source = "source", Target = "target",
                   Value = "value", NodeID = "name",
                   fontSize = 12, nodeWidth = 10,
                   colourScale = colorJS, NodeGroup = "group",
                   LinkGroup = "group")
# pat is the path of the output HTML file
saveWidget(p, pat, selfcontained = FALSE, libdir = "lib")

With selfcontained = FALSE, the HTML file is saved together with a library directory (libdir) that holds the JavaScript dependencies. With selfcontained = TRUE, the scripts are embedded in the resulting HTML file instead.

Unfortunately, there is no automatic way to show numbers next to the node names. You therefore need a trick: paste the numbers into the node names with the paste() function.
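
A minimal sketch of this trick, assuming the counts per node are available in a vector (the node names and counts are made up for illustration):

```r
# Hypothetical nodes and per-node counts (made-up values)
nodes  <- data.frame(name = c("type 0", "type 1", "type 2"),
                     stringsAsFactors = FALSE)
counts <- c(120, 45, 30)

# Keep the plain name as the group so the colour scale domain still matches
nodes$group <- nodes$name

# Append the count to the label that is displayed next to each node
nodes$name <- paste0(nodes$name, " (n = ", counts, ")")
```

Copying the plain name into the group column before relabeling means the colors passed via d3.scaleOrdinal() are not affected by the changed labels.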

Another nice thing about the networkD3 package is that it is easy to use in Shiny, so you can have your own dashboard for creating Sankey plots.
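
A minimal sketch of such a dashboard, using sankeyNetworkOutput() and renderSankeyNetwork() from networkD3. The data, the input name "fontSize" and the output name "sankey" are assumptions for illustration:

```r
library(shiny)
library(networkD3)

# Hypothetical data (made-up values); source and target are zero-based
nodes <- data.frame(name = c("type 0", "type 1", "type 2"),
                    stringsAsFactors = FALSE)
links <- data.frame(source = c(0, 0), target = c(1, 2), value = c(60, 40))

ui <- fluidPage(
  sliderInput("fontSize", "Font size", min = 8, max = 20, value = 12),
  sankeyNetworkOutput("sankey")
)

server <- function(input, output) {
  # Redraw the plot whenever the slider changes
  output$sankey <- renderSankeyNetwork({
    sankeyNetwork(Links = links, Nodes = nodes,
                  Source = "source", Target = "target",
                  Value = "value", NodeID = "name",
                  fontSize = input$fontSize, nodeWidth = 10)
  })
}

# Create the app object; launch it with runApp(app)
app <- shinyApp(ui = ui, server = server)
```

In a real dashboard, the inputs would more likely select the data subset or the time window than the font size.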

If you need any help, message me in the comments!

Have fun!
