This assignment attempts to solve the 2021 IEEE Visual Analytics Science and Technology (VAST) Challenge: Mini-Challenge 2 by applying different visual analytics concepts, methods, and techniques with relevant R data visualisation and data analysis packages.
Given the data sources provided, identify potential informal or unofficial relationships among GASTech personnel. Provide evidence for these relationships.
As in question 3, identify the points of interest (POIs) by computing the time difference between consecutive GPS timestamps for each car; a gap of more than five minutes is treated as a stop.
Then flag the stops that are in ‘close contact’, that is, stops whose GPS coordinates differ by no more than 0.001 degrees in both latitude and longitude from the preceding stop in time.
Employees who stop at the same place at about the same time are likely to be meeting, which helps establish the relationships among GASTech personnel.
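Before applying the gap rule to the full GPS table, it may help to see it on a tiny made-up log. The sketch below is only an illustration: `toy_gps`, its two cars, and the second offsets are hypothetical, and only the five-minute threshold is taken from this post.

```r
library(dplyr)

# Hypothetical toy GPS log: two cars, offsets given in seconds
toy_gps <- tibble(
  CarID = c(1, 1, 1, 2, 2, 2),
  timestamp = as.POSIXct("2014-01-06 08:00:00", tz = "UTC") +
    c(0, 10, 910, 0, 5, 10)
)

toy_stops <- toy_gps %>%
  group_by(CarID) %>%
  arrange(timestamp, .by_group = TRUE) %>%
  mutate(
    # Seconds elapsed since the previous record of the same car
    gap_secs = as.numeric(difftime(timestamp, lag(timestamp), units = "secs")),
    # A gap longer than 5 minutes is treated as a stop (POI)
    poi = !is.na(gap_secs) & gap_secs > 60 * 5
  ) %>%
  ungroup()

print(toy_stops)
```

In this toy log only the third record of car 1 is flagged, because it arrives 900 seconds after the previous one; car 2 reports every five seconds and is never flagged.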
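library(dplyr)

gps_poi_network <- car_gps_data %>%
  # Step 1: flag stops (POIs) as gaps of more than 5 minutes between
  # consecutive GPS records of the same car. Ordering by timestamp and fixing
  # the units to seconds keeps the comparison with 60 * 5 unambiguous.
  group_by(CarID) %>%
  mutate(poi_diff = difftime(timestamp,
                             lag(timestamp, order_by = timestamp),
                             units = "secs"),
         poi = poi_diff > 60 * 5) %>%
  filter(poi) %>%
  ungroup() %>%
  # Step 2: compare each stop with the previous stop in time (any car) and
  # keep those within 0.001 degrees in both latitude and longitude.
  mutate(lat_diff = lat - lag(lat, order_by = timestamp),
         long_diff = long - lag(long, order_by = timestamp),
         close_contact = abs(lat_diff) <= 0.001 & abs(long_diff) <= 0.001) %>%
  filter(close_contact)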
glimpse(gps_poi_network)
Rows: 773
Columns: 15
$ timestamp <dttm> 2014-01-06 07:34:01, 2014-01-06 07:44:01, 201~
$ CarID <fct> 10, 12, 8, 13, 30, 22, 16, 107, 33, 10, 20, 19~
$ lat <dbl> 36.07333, 36.06365, 36.06365, 36.05408, 36.054~
$ long <dbl> 24.86418, 24.88593, 24.88594, 24.90125, 24.901~
$ date <dttm> 2014-01-06, 2014-01-06, 2014-01-06, 2014-01-0~
$ day <ord> Mon, Mon, Mon, Mon, Mon, Mon, Mon, Mon, Mon, M~
$ hour <int> 7, 7, 7, 8, 8, 8, 11, 11, 11, 11, 11, 11, 12, ~
$ Deparment <chr> "Executive", "Security", "Information Technolo~
$ Title <chr> "SVP/CIO", "Site Control", "IT Technician", "S~
$ FullName <chr> "Ada Campo-Corrente", "Hideki Cocinaro", "Luca~
$ poi_diff <drtn> 1980 secs, 1965 secs, 1307 secs, 2005 secs, 1~
$ poi <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE~
$ lat_diff <dbl> -1.833e-05, -5.626e-05, -6.738e-05, -2.531e-05~
$ long_diff <dbl> 7.720e-06, 3.232e-05, 1.306e-05, 2.260e-06, -6~
$ close_contact <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE~
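To get a feel for how strict the 0.001-degree window is, the threshold can be converted into an approximate ground distance. This is a rough sketch rather than part of the original analysis: it assumes a spherical Earth with a mean radius of 6,371 km and uses a reference latitude of 36.06 degrees, close to the values shown in the output above.

```r
# Approximate ground distance covered by the 0.001-degree threshold.
# Assumption: spherical Earth with a mean radius of 6,371 km.
earth_radius_m <- 6371000
threshold_deg  <- 0.001
ref_lat_deg    <- 36.06   # assumed reference latitude, near the data shown

# Metres per degree of latitude, and per degree of longitude at ref_lat_deg
m_per_deg_lat  <- pi * earth_radius_m / 180
m_per_deg_long <- m_per_deg_lat * cos(ref_lat_deg * pi / 180)

lat_window_m  <- threshold_deg * m_per_deg_lat
long_window_m <- threshold_deg * m_per_deg_long

round(c(lat = lat_window_m, long = long_window_m))
```

Under these assumptions the window is about 111 m north-south and about 90 m east-west, so two stops flagged as close contact may be at neighbouring buildings rather than at exactly the same spot.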
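employee_edges <- gps_poi_network %>%
  # Within each date-hour window, link each person to the next person
  # recorded at a close-contact stop; `to` is NA for the last stop in a window
  group_by(date, hour) %>%
  mutate(from = FullName,
         to = lead(FullName, order_by = timestamp)) %>%
  ungroup() %>%
  # Count how often each directed pair occurs; this count is the edge weight
  group_by(from, to) %>%
  summarise(weight = n(), .groups = "drop")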
employee_nodes <- gps_poi_network %>%
  select(FullName, Deparment, Title) %>%
  rename(id = FullName, group = Deparment) %>%
  distinct()
library(visNetwork)

# Nodes are coloured by `group` (department); edges point from `from` to `to`
visNetwork(employee_nodes,
           employee_edges,
           main = "Relationships among GASTech Personnel") %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visEdges(arrows = "to",
           smooth = list(enabled = TRUE,
                         type = "curvedCW")) %>%
  visOptions(highlightNearest = TRUE,
             nodesIdSelection = TRUE) %>%
  visLegend() %>%
  visLayout(randomSeed = 123)
The network diagram shows the ‘official’ relationships between employees through their respective departments. It also shows ‘unofficial’ relationships as edges derived from their interactions.
From the network diagram, Isande Barrasca, a Drill Technician from the Engineering department, appears to be an outlier: his only close contact among the other employees is Hideki Cocinaro, who works in Site Control in the Security department.
Similarly, Sten Sanjorge Jr., an IT Technician from the Information Technology department, has minimal interactions with other employees and does not seem to be well connected within the company.
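A visual impression of who is poorly connected can be cross-checked by counting each person's distinct contacts in the edge list. The sketch below uses a made-up edge list with the same columns as `employee_edges`; the names `toy_edges`, `pairs`, and `contacts` are hypothetical, and the real table would be substituted for the toy one.

```r
# Hypothetical edge list in the same shape as employee_edges (from, to, weight)
toy_edges <- data.frame(
  from   = c("A", "A", "B", "C"),
  to     = c("B", "C", "C", "D"),
  weight = c(3, 1, 2, 1)
)

# List every (person, contact) pair in both directions, then drop duplicates
# so that each contact is counted once regardless of edge direction
pairs <- unique(rbind(
  data.frame(person = toy_edges$from, contact = toy_edges$to),
  data.frame(person = toy_edges$to,   contact = toy_edges$from)
))

# Number of distinct contacts per person, least connected first
contacts <- sort(table(pairs$person))
print(contacts)
```

In the toy data, D has a single contact and is listed first, which is the pattern the diagram suggests for the outlying employees. Rows with NA names would need to be handled separately before applying this to the real edge list.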
The heatmap below visualizes the number of interactions between employees.
employee_interact1 <- full_join(employee_edges,
                                employee_nodes,
                                by = c("from" = "id")) %>%
  rename(SenderDepartment = group,
         SenderTitle = Title)
employee_interact2 <- full_join(employee_interact1,
                                employee_nodes,
                                by = c("to" = "id")) %>%
  rename(ReceiverDepartment = group,
         ReceiverTitle = Title,
         Sender = from,
         Receiver = to)
library(ggplot2)
library(plotly)

# The `text` aesthetic is not drawn by ggplot2; ggplotly uses it as the tooltip
employee_interaction <- ggplot(data = employee_interact2,
                               aes(x = Sender, y = Receiver,
                                   fill = weight,
                                   text = paste("Sender:", Sender, "\n",
                                                "Sender Department:", SenderDepartment, "\n",
                                                "Sender Title:", SenderTitle, "\n",
                                                "\n",
                                                "Receiver:", Receiver, "\n",
                                                "Receiver Department:", ReceiverDepartment, "\n",
                                                "Receiver Title:", ReceiverTitle, "\n",
                                                "\n",
                                                "Number of Meetings:", weight))) +
  geom_tile() +
  scale_fill_gradient(low = "lightsteelblue1", high = "royalblue4") +
  ggtitle("GAStech Personnel Relationship based on number of Interactions") +
  labs(x = "Sender Employee", y = "Receiver Employee") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90))
ggplotly(employee_interaction, tooltip = "text")
The highest number of meetings, 23, is between truck drivers. Their names appear as NA because the trucks' CarIDs are not linked to named employees. The second highest, 14 meetings, involves Bertrand Ovan, Group Manager in the Facilities department.
The third highest, 12 meetings, involves Ingrid Barranco, SVP/CFO in the Executive department.
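A ranking like this can also be read directly from the edge table instead of from the heatmap. The following is a minimal sketch on made-up data: `toy_edges` and its names are hypothetical stand-ins for `employee_edges`, and the weights are chosen only to mirror the counts mentioned above.

```r
library(dplyr)

# Hypothetical edge weights; the real values come from employee_edges
toy_edges <- tibble(
  from   = c("A", "A", "B", "C"),
  to     = c("B", "C", "C", "D"),
  weight = c(3L, 14L, 23L, 12L)
)

# Sort the pairs by number of meetings and keep the three heaviest
top_pairs <- toy_edges %>%
  arrange(desc(weight)) %>%
  slice_head(n = 3)

print(top_pairs)
```

Sorting the table this way makes it easier to report exact counts and to spot ties that are hard to distinguish by colour alone.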
Click HERE to view the Visual Detective Assignment Part 5.
For attribution, please cite this work as
Dolit (2021, July 25). Visual Analytics & Applications: Visual Detective Assignment Part 4. Retrieved from https://adolit-vaa.netlify.app/posts/2021-07-26-assignment-4/
BibTeX citation
@misc{dolit2021visual,
  author = {Dolit, Archie},
  title = {Visual Analytics & Applications: Visual Detective Assignment Part 4},
  url = {https://adolit-vaa.netlify.app/posts/2021-07-26-assignment-4/},
  year = {2021}
}