Basic Syntax (2)

Array

Weekly design


Pre-class video


Array



# Create an N-dimensional array

# Fill a 2×4 matrix with the values 1 to 5 (recycled to fill all 8 cells)
x = array(1:5, c(2, 4)) 

x
     [,1] [,2] [,3] [,4]
[1,]    1    3    5    2
[2,]    2    4    1    3
# Print the elements of row 1
x[1, ] 
[1] 1 3 5 2
# Print the elements of column 2
x[, 2] 
[1] 3 4
# Set row and column names
dimnamex = list(c("1st", "2nd"), c("1st", "2nd", "3rd", "4th")) 

x = array(1:5, c(2, 4), dimnames = dimnamex)
x
    1st 2nd 3rd 4th
1st   1   3   5   2
2nd   2   4   1   3
x["1st", ]
1st 2nd 3rd 4th 
  1   3   5   2 
x[, "4th"]
1st 2nd 
  2   3 
# Create a two-dimensional array
x = 1:12
x
 [1]  1  2  3  4  5  6  7  8  9 10 11 12
matrix(x, nrow = 3)
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
matrix(x, nrow = 3, byrow = TRUE)
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
# Create an array by combining vectors
v1 = c(1, 2, 3, 4)
v2 = c(5, 6, 7, 8)
v3 = c(9, 10, 11, 12)

# Create an array by binding by column
cbind(v1, v2, v3) 
     v1 v2 v3
[1,]  1  5  9
[2,]  2  6 10
[3,]  3  7 11
[4,]  4  8 12
# Create an array by binding by row
rbind(v1, v2, v3) 
   [,1] [,2] [,3] [,4]
v1    1    2    3    4
v2    5    6    7    8
v3    9   10   11   12
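The two pairs of examples above are closely related: binding by row gives the transpose of binding by column, and filling a matrix by row gives the transpose of filling by column. A minimal sketch that checks both relationships with the same vectors:

```r
v1 <- c(1, 2, 3, 4)
v2 <- c(5, 6, 7, 8)
v3 <- c(9, 10, 11, 12)

# Binding by row is the transpose of binding by column
by_row <- rbind(v1, v2, v3)
by_col <- cbind(v1, v2, v3)
all.equal(by_row, t(by_col))

# Filling a 3-row matrix by row is the transpose of
# filling a 4-row matrix by column
filled_by_row <- matrix(1:12, nrow = 3, byrow = TRUE)
filled_by_col <- matrix(1:12, nrow = 4)
all.equal(filled_by_row, t(filled_by_col))

# Both results have 3 rows and 4 columns
dim(by_row)
```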
# Various matrix operations using the operators in [Table 3-7]
# Store two 2×2 matrices in x and y, respectively
x = array(1:4, dim = c(2, 2))
y = array(5:8, dim = c(2, 2))
x
     [,1] [,2]
[1,]    1    3
[2,]    2    4
y
     [,1] [,2]
[1,]    5    7
[2,]    6    8
x+y
     [,1] [,2]
[1,]    6   10
[2,]    8   12
x-y
     [,1] [,2]
[1,]   -4   -4
[2,]   -4   -4
# element-wise multiplication
x * y 
     [,1] [,2]
[1,]    5   21
[2,]   12   32
# mathematical matrix multiplication
x %*% y 
     [,1] [,2]
[1,]   23   31
[2,]   34   46
# transpose matrix of x
t(x) 
     [,1] [,2]
[1,]    1    2
[2,]    3    4
# inverse of x
solve(x) 
     [,1] [,2]
[1,]   -2  1.5
[2,]    1 -0.5
# determinant of x
det(x) 
[1] -2
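As a quick check on the results above, multiplying a matrix by its inverse should give the identity matrix, and the determinant of a 2×2 matrix is ad - bc. A minimal sketch using the same x; the matrix named singular is introduced here only for illustration:

```r
x <- array(1:4, dim = c(2, 2))

# x %*% solve(x) should equal the 2x2 identity matrix (up to rounding error)
inverse_check <- all.equal(x %*% solve(x), diag(2))
inverse_check

# Determinant of a 2x2 matrix computed by hand: a*d - b*c
manual_det <- x[1, 1] * x[2, 2] - x[1, 2] * x[2, 1]
manual_det

# solve() stops with an error for a singular matrix (determinant 0)
singular <- matrix(c(1, 2, 2, 4), nrow = 2)
solve_failed <- inherits(try(solve(singular), silent = TRUE), "try-error")
solve_failed
```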
x = array(1:12, c(3, 4))
x
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
# If the second argument (MARGIN) is 1, apply the function to each row
apply(x, 1, mean) 
[1] 5.5 6.5 7.5
# If the second argument (MARGIN) is 2, apply the function to each column
apply(x, 2, mean) 
[1]  2  5  8 11
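The second argument of apply() is named MARGIN. A minimal sketch, reusing the 3×4 array from above, showing that MARGIN = 1 and MARGIN = 2 give the same results as base R's rowMeans() and colMeans():

```r
# Same 3x4 array as above
x <- array(1:12, c(3, 4))

row_means <- apply(x, 1, mean)  # MARGIN = 1: one result per row
col_means <- apply(x, 2, mean)  # MARGIN = 2: one result per column

# rowMeans() and colMeans() are base R shortcuts for the same computation
all.equal(row_means, rowMeans(x))
all.equal(col_means, colMeans(x))
```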
x = array(1:12, c(3, 4))
dim(x)
[1] 3 4
x = array(1:12, c(3, 4))

# Randomly shuffle all elements of the array
sample(x) 
 [1]  6  7  8 12  9  3  1  5  4 11 10  2
# Randomly draw 10 elements from the array (without replacement)
sample(x, 10) 
 [1]  2  1  5  9  7 12  3  8  4 11
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
# ?sample

# The selection probability of each element can be varied
# (prob is a vector of weights; it does not have to sum to 1)
sample(x, 10, prob = c(1:12)/24) 
 [1] 12  5 11  8 10  4  6  2  3  9
# Given a single number n, sample(n) returns a random permutation of 1:n
sample(10) 
 [1]  2  3  6  4  8  7  1  5  9 10
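Because sample() draws at random, the outputs above will differ from run to run. A short sketch of two related options: set.seed() to make a draw reproducible, and replace = TRUE to draw with replacement. The seed value 42 is an arbitrary choice for illustration.

```r
x <- array(1:12, c(3, 4))

set.seed(42)                 # arbitrary seed, chosen only for illustration
first_draw <- sample(x, 10)

set.seed(42)                 # resetting the same seed reproduces the draw
second_draw <- sample(x, 10)
identical(first_draw, second_draw)

# With replace = TRUE an element may be drawn more than once,
# so the sample size may exceed the number of elements
with_repl <- sample(x, 20, replace = TRUE)
length(with_repl)
```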

Class


  • Create a new project

    • *.Rproj

    • *.R

    • getwd()

  • Variable and Object

    • An object in R is a data structure used for storing data: Everything in R is an object, including functions, numbers, character strings, vectors, and lists. Each object has attributes such as its type (e.g., integer, numeric, character), its length, and often its dimensions. Objects can be complex structures, like data frames that hold tabular data, or simpler structures like a single numeric value or vector.

    • A variable in R is a name that you assign to an object so that you can refer to it later in your code. When you assign data to a variable, you are effectively labeling that data with a name that you can use to call up the object later on.

Here’s a simple example in R:

my_vector <- c(1, 2, 3)
  • my_vector is a variable. It’s a symbolic name that we’re using to refer to some data we’re interested in.

  • c(1, 2, 3) creates a vector object containing the numbers 1, 2, and 3.

  • This vector is the object, and it’s the actual data structure that R is storing in memory.
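The attributes mentioned above (type, length, dimensions) can be inspected directly. A small sketch; my_matrix is a name introduced here only for illustration:

```r
my_vector <- c(1, 2, 3)

typeof(my_vector)   # storage type of the object: "double"
class(my_vector)    # "numeric"
length(my_vector)   # 3
dim(my_vector)      # NULL: a plain vector has no dim attribute

# Functions are objects too
class(sum)          # "function"

# A matrix is a vector that carries a dim attribute
my_matrix <- matrix(1:6, nrow = 2)
dim(my_matrix)      # 2 3
```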

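# Remove all objects stored in the workspace
# (rm() with no arguments removes nothing; pass the names returned by ls())
rm(list = ls())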

# Create a vector 1 to 10
1:10
 [1]  1  2  3  4  5  6  7  8  9 10
# Sampling 10 values from the vector 1:10
sample(1:10, 10)
 [1]  8  4  5  9 10  7  6  1  2  3
X <- sample(1:10, 10)
# Extract 2nd to 5th elements of X
X[2:5]
[1]  5 10  3  8


  • Vectorized code
c(1, 2, 4) + c(2, 3, 5)
[1] 3 5 9


X <- c(1,2,4,5)

X * 2
[1]  2  4  8 10
  • Recycling rule
1:4 + c(1, 2)
[1] 2 4 4 6
X<-c(1,2,4,5)
X * 2
[1]  2  4  8 10
1:4 + 1:3
Warning in 1:4 + 1:3: longer object length is not a multiple of shorter object
length
[1] 2 4 6 5
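The recycling rule amounts to repeating the shorter vector until it matches the length of the longer one. A sketch that writes the repetition out with rep_len():

```r
# Implicit recycling: c(1, 2) is reused as 1, 2, 1, 2
implicit <- 1:4 + c(1, 2)

# The same computation with the recycling written out
explicit <- 1:4 + rep_len(c(1, 2), 4)
identical(implicit, explicit)

# When the lengths do not divide evenly, R still recycles
# (1:3 is used as 1, 2, 3, 1) but issues a warning
partial <- suppressWarnings(1:4 + 1:3)
partial
```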

Pop-up Quiz

Choose the two variables that are not suited to the ‘factor’ type in R

  1. GPA

  2. Blood type

  3. Grade (A,B,C,D,F)

  4. Height

  5. Gender


What is the result of the following R code?

my_vector <- c(3.5, -1.6, TRUE, "R")
class(my_vector)
[1] "character"
  1. “numeric”

  2. “logical”

  3. “character”

  4. “complex”
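An atomic vector holds a single type, so c() coerces mixed inputs to the most flexible type present, in the order logical, then numeric, then character. A brief sketch:

```r
class(c(TRUE, FALSE))        # "logical"
class(c(TRUE, 2.5))          # "numeric": TRUE becomes 1
class(c(TRUE, 2.5, "R"))     # "character": everything becomes a string

# The coerced values themselves
logical_to_numeric <- c(TRUE, 2.5)
logical_to_numeric           # 1.0 2.5

mixed <- c(3.5, -1.6, TRUE, "R")
mixed                        # "3.5" "-1.6" "TRUE" "R"
```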


Consider the following R code. What is its result?

my_vector <- c(10, "20", 30)
sum(as.numeric(my_vector))
[1] 60
  1. 60

  2. “60”

  3. 40

  4. An error
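For context on the conversion step in this question: as.numeric() parses each string as a number and returns NA, with a warning, for strings it cannot parse. A short sketch:

```r
my_vector <- c(10, "20", 30)
class(my_vector)                 # "character"

# sum() refuses character input, so the explicit conversion is required
sum_failed <- inherits(try(sum(my_vector), silent = TRUE), "try-error")
sum_failed

converted <- as.numeric(my_vector)
converted                        # 10 20 30

# Strings that are not numbers become NA (with a warning)
unparsed <- suppressWarnings(as.numeric(c("1.5", "abc")))
unparsed                         # 1.5 NA
```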


Understanding Arrays in R: Concepts and Examples

Arrays are a fundamental data structure in R that extend vectors to multiple dimensions. A vector has a single dimension; an array can have two, three, or more, which makes it suitable for data that is naturally organized as a grid or as a stack of grids.

What is an Array in R?

An array in R is a collection of elements of the same type arranged in a grid with a specified number of dimensions. A two-dimensional array is a matrix; arrays can also hold values in three or more dimensions. Arrays are particularly useful where operations on multi-dimensional data are required, such as matrix computations, tabulations, and other tasks in data analysis and statistics.

Creating an Array

To create an array in R, you can use the array function. This function takes a vector of data and a vector of dimensions as arguments. For example:

# Create a 2x3 array
my_array <- array(1:6, dim = c(2, 3))
print(my_array)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

This code snippet creates a 2x3 array (2 rows and 3 columns) with the numbers 1 to 6.
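As a complement, the sketch below shows that an array is a vector with a dim attribute, and how to inspect its shape. The variable v is introduced here only for illustration.

```r
# array() fills column by column: 1, 2 go into column 1, then 3, 4, ...
my_array <- array(1:6, dim = c(2, 3))

# Equivalent construction: set the dim attribute of a plain vector
v <- 1:6
dim(v) <- c(2, 3)
all(v == my_array)

# Inspecting the shape
dim(my_array)     # 2 3
nrow(my_array)    # 2
ncol(my_array)    # 3
length(my_array)  # 6: total number of elements
```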

Accessing Array Elements

Elements within an array can be accessed using indices for each dimension in square brackets []. For example:

# Access the element in the 1st row and 2nd column
element <- my_array[1, 2]
print(element)
[1] 3
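Leaving an index empty selects everything along that dimension. A short sketch of row, column, and multi-column access on the same 2x3 array, including the drop argument that controls whether a single column stays a matrix:

```r
my_array <- array(1:6, dim = c(2, 3))

first_row <- my_array[1, ]        # whole first row: 1 3 5
second_col <- my_array[, 2]       # whole second column: 3 4
two_cols <- my_array[, c(1, 3)]   # columns 1 and 3 as a 2x2 matrix

# By default a single row or column is simplified to a vector;
# drop = FALSE keeps the result as a matrix
kept <- my_array[, 2, drop = FALSE]
dim(kept)                         # 2 1
```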

Modifying Arrays

Just like vectors, you can modify the elements of an array by accessing them using their indices and assigning new values. For example:

# Modify the element in the 1st row and 2nd column to be 20
my_array[1, 2] <- 20
print(my_array)
     [,1] [,2] [,3]
[1,]    1   20    5
[2,]    2    4    6

Operations on Arrays

R allows you to perform operations on arrays. These operations can be element-wise or can involve the entire array. For example, you can add two arrays of the same dimensions, and R will perform element-wise addition.
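A minimal sketch of these operations; the arrays a and b are introduced here only for illustration:

```r
a <- array(1:6, dim = c(2, 3))
b <- array(10 * (1:6), dim = c(2, 3))

elementwise_sum <- a + b   # each cell of a is added to the matching cell of b
doubled <- a * 2           # the scalar 2 is applied to every element
total <- sum(a)            # a summary over the entire array: 21

# Arrays with different dimensions cannot be combined element-wise
mismatch_failed <- inherits(
  try(a + array(1:6, dim = c(3, 2)), silent = TRUE),
  "try-error"
)
mismatch_failed
```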

Example: Creating and Manipulating a 3D Array

# Create a 3x2x2 array
my_3d_array <- array(1:12, dim = c(3, 2, 2))
print(my_3d_array)
, , 1

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

, , 2

     [,1] [,2]
[1,]    7   10
[2,]    8   11
[3,]    9   12
# Access an element (2nd row, 1st column, 2nd matrix)
element_3d <- my_3d_array[2, 1, 2]
print(element_3d)
[1] 8
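The same 3D array also supports extracting a whole matrix and summarising along the third dimension. A brief sketch:

```r
my_3d_array <- array(1:12, dim = c(3, 2, 2))

# Leaving the first two indices empty returns a whole 3x2 matrix
second_matrix <- my_3d_array[, , 2]
second_matrix

# apply() with MARGIN = 3 summarises each matrix separately
layer_sums <- apply(my_3d_array, 3, sum)
layer_sums                   # 21 57

# Modifying works as for 2D arrays, with one index per dimension
my_3d_array[2, 1, 2] <- 0
my_3d_array[2, 1, 2]         # 0
```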


Quiz: Test Your Understanding of Arrays in R


Question 1: Let arr <- array(1:27, dim = c(3, 3, 3)). What is the value of arr[3, 1, 2], the element in the third row and first column of the second matrix?

A) 3
B) 12
C) 21
D) 9


Question 2: Which of the following statements creates a 2x2x3 array containing the numbers 1 through 12 in R?

A) array(1:12, dim = c(2, 2, 3))
B) matrix(1:12, nrow = 2, ncol = 2)
C) c(1:12)
D) array(1:12, dim = c(3, 2, 2))


Question 3: Using one index per dimension, how do you set the element at position [1, 1, 1] in a 3-dimensional array named ‘arr’ to 100?

A) arr[1] <- 100
B) arr[1, 1, 1] <- 100
C) arr[c(1, 1, 1)] <- 100
D) Both B and C are correct.


Answers:

Answer 1: B) 12
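Explanation: Arrays in R are filled column-wise, one matrix at a time, so in a 3x3x3 array holding 1 to 27 the second matrix holds 10 to 18 and its first column is 10, 11, 12. The element in the third row and first column of the second matrix, arr[3, 1, 2], is therefore 12.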

Answer 2: A) array(1:12, dim = c(2, 2, 3))
Explanation: The array function with dimension argument c(2, 2, 3) will create a 2x2x3 array, filling the elements from 1 to 12 across the dimensions.

Answer 3: B) arr[1, 1, 1] <- 100
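Explanation: To address a specific element by position, give one index per dimension: arr[1, 1, 1] <- 100. Options A and C index the underlying vector instead; they reach the first element only because [1, 1, 1] is also position 1, and the same pattern does not work for other positions.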
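The column-wise filling order mentioned in the explanations can be checked directly. In the sketch below, arr is a 3x3x3 array holding 1 to 27, and linear_index() is a helper written here for illustration, not a base R function:

```r
arr <- array(1:27, dim = c(3, 3, 3))

# Position of [i, j, k] in the underlying vector (column-major order)
linear_index <- function(i, j, k, d = dim(arr)) {
  i + (j - 1) * d[1] + (k - 1) * d[1] * d[2]
}

arr[3, 1, 2]                 # 12
linear_index(3, 1, 2)        # 12
arr[linear_index(3, 1, 2)]   # 12, the same element via a single index

# Indexing with a vector selects several positions, not one [i, j, k] cell
arr[c(1, 1, 1)]              # 1 1 1: the first element, three times
```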


COVID-19 matrix and visualization

Data import

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ forcats   1.0.0     ✔ readr     2.1.4
✔ ggplot2   3.4.4     ✔ stringr   1.5.0
✔ lubridate 1.9.3     ✔ tibble    3.2.1
✔ purrr     1.0.2     ✔ tidyr     1.3.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(rvest)

Attaching package: 'rvest'

The following object is masked from 'package:readr':

    guess_encoding
library(stringdist)

Attaching package: 'stringdist'

The following object is masked from 'package:tidyr':

    extract
library(reshape2)

Attaching package: 'reshape2'

The following object is masked from 'package:tidyr':

    smiths

Import Johns Hopkins COVID-19 data

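# Reshape a raw Johns Hopkins table (e.g. confirmed_raw) into long format:
# one row per country and date
clean_jhd_to_long <- function(df) {
  # Name of the object passed in, e.g. "confirmed_raw"
  df_str <- deparse(substitute(df))
  # Drop the trailing "_raw" to get the value column name, e.g. "confirmed"
  var_str <- substr(df_str, 1, str_length(df_str) - 4)
  
  df %>% group_by(`Country/Region`) %>%
    filter(`Country/Region` != "Cruise Ship") %>%
    select(-`Province/State`, -Lat, -Long) %>%
    # Replace each date column with the country total across provinces
    mutate_at(vars(-group_cols()), sum) %>% 
    # Rows within a country are now identical, so keep one per country
    distinct() %>%
    ungroup() %>%
    rename(country = `Country/Region`) %>%
    # Wide to long: one row per country and date string
    pivot_longer(
      -country, 
      names_to = "date_str", 
      values_to = var_str
    ) %>%
    # Parse month/day/year strings such as "1/22/20" into dates
    mutate(date = mdy(date_str)) %>%
    select(country, date, !! sym(var_str)) 
}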

confirmed_raw <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv")
Rows: 289 Columns: 1147
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr    (2): Province/State, Country/Region
dbl (1145): Lat, Long, 1/22/20, 1/23/20, 1/24/20, 1/25/20, 1/26/20, 1/27/20,...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
deaths_raw <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv")
Rows: 289 Columns: 1147
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr    (2): Province/State, Country/Region
dbl (1145): Lat, Long, 1/22/20, 1/23/20, 1/24/20, 1/25/20, 1/26/20, 1/27/20,...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

See the data

head(confirmed_raw, 10)
# A tibble: 10 × 1,147
   `Province/State`  `Country/Region`   Lat   Long `1/22/20` `1/23/20` `1/24/20`
   <chr>             <chr>            <dbl>  <dbl>     <dbl>     <dbl>     <dbl>
 1 <NA>              Afghanistan       33.9  67.7          0         0         0
 2 <NA>              Albania           41.2  20.2          0         0         0
 3 <NA>              Algeria           28.0   1.66         0         0         0
 4 <NA>              Andorra           42.5   1.52         0         0         0
 5 <NA>              Angola           -11.2  17.9          0         0         0
 6 <NA>              Antarctica       -71.9  23.3          0         0         0
 7 <NA>              Antigua and Bar…  17.1 -61.8          0         0         0
 8 <NA>              Argentina        -38.4 -63.6          0         0         0
 9 <NA>              Armenia           40.1  45.0          0         0         0
10 Australian Capit… Australia        -35.5 149.           0         0         0
# ℹ 1,140 more variables: `1/25/20` <dbl>, `1/26/20` <dbl>, `1/27/20` <dbl>,
#   `1/28/20` <dbl>, `1/29/20` <dbl>, `1/30/20` <dbl>, `1/31/20` <dbl>,
#   `2/1/20` <dbl>, `2/2/20` <dbl>, `2/3/20` <dbl>, `2/4/20` <dbl>,
#   `2/5/20` <dbl>, `2/6/20` <dbl>, `2/7/20` <dbl>, `2/8/20` <dbl>,
#   `2/9/20` <dbl>, `2/10/20` <dbl>, `2/11/20` <dbl>, `2/12/20` <dbl>,
#   `2/13/20` <dbl>, `2/14/20` <dbl>, `2/15/20` <dbl>, `2/16/20` <dbl>,
#   `2/17/20` <dbl>, `2/18/20` <dbl>, `2/19/20` <dbl>, `2/20/20` <dbl>, …

Countries in data

unique(confirmed_raw$`Country/Region`)
  [1] "Afghanistan"                      "Albania"                         
  [3] "Algeria"                          "Andorra"                         
  [5] "Angola"                           "Antarctica"                      
  [7] "Antigua and Barbuda"              "Argentina"                       
  [9] "Armenia"                          "Australia"                       
 [11] "Austria"                          "Azerbaijan"                      
 [13] "Bahamas"                          "Bahrain"                         
 [15] "Bangladesh"                       "Barbados"                        
 [17] "Belarus"                          "Belgium"                         
 [19] "Belize"                           "Benin"                           
 [21] "Bhutan"                           "Bolivia"                         
 [23] "Bosnia and Herzegovina"           "Botswana"                        
 [25] "Brazil"                           "Brunei"                          
 [27] "Bulgaria"                         "Burkina Faso"                    
 [29] "Burma"                            "Burundi"                         
 [31] "Cabo Verde"                       "Cambodia"                        
 [33] "Cameroon"                         "Canada"                          
 [35] "Central African Republic"         "Chad"                            
 [37] "Chile"                            "China"                           
 [39] "Colombia"                         "Comoros"                         
 [41] "Congo (Brazzaville)"              "Congo (Kinshasa)"                
 [43] "Costa Rica"                       "Cote d'Ivoire"                   
 [45] "Croatia"                          "Cuba"                            
 [47] "Cyprus"                           "Czechia"                         
 [49] "Denmark"                          "Diamond Princess"                
 [51] "Djibouti"                         "Dominica"                        
 [53] "Dominican Republic"               "Ecuador"                         
 [55] "Egypt"                            "El Salvador"                     
 [57] "Equatorial Guinea"                "Eritrea"                         
 [59] "Estonia"                          "Eswatini"                        
 [61] "Ethiopia"                         "Fiji"                            
 [63] "Finland"                          "France"                          
 [65] "Gabon"                            "Gambia"                          
 [67] "Georgia"                          "Germany"                         
 [69] "Ghana"                            "Greece"                          
 [71] "Grenada"                          "Guatemala"                       
 [73] "Guinea"                           "Guinea-Bissau"                   
 [75] "Guyana"                           "Haiti"                           
 [77] "Holy See"                         "Honduras"                        
 [79] "Hungary"                          "Iceland"                         
 [81] "India"                            "Indonesia"                       
 [83] "Iran"                             "Iraq"                            
 [85] "Ireland"                          "Israel"                          
 [87] "Italy"                            "Jamaica"                         
 [89] "Japan"                            "Jordan"                          
 [91] "Kazakhstan"                       "Kenya"                           
 [93] "Kiribati"                         "Korea, North"                    
 [95] "Korea, South"                     "Kosovo"                          
 [97] "Kuwait"                           "Kyrgyzstan"                      
 [99] "Laos"                             "Latvia"                          
[101] "Lebanon"                          "Lesotho"                         
[103] "Liberia"                          "Libya"                           
[105] "Liechtenstein"                    "Lithuania"                       
[107] "Luxembourg"                       "MS Zaandam"                      
[109] "Madagascar"                       "Malawi"                          
[111] "Malaysia"                         "Maldives"                        
[113] "Mali"                             "Malta"                           
[115] "Marshall Islands"                 "Mauritania"                      
[117] "Mauritius"                        "Mexico"                          
[119] "Micronesia"                       "Moldova"                         
[121] "Monaco"                           "Mongolia"                        
[123] "Montenegro"                       "Morocco"                         
[125] "Mozambique"                       "Namibia"                         
[127] "Nauru"                            "Nepal"                           
[129] "Netherlands"                      "New Zealand"                     
[131] "Nicaragua"                        "Niger"                           
[133] "Nigeria"                          "North Macedonia"                 
[135] "Norway"                           "Oman"                            
[137] "Pakistan"                         "Palau"                           
[139] "Panama"                           "Papua New Guinea"                
[141] "Paraguay"                         "Peru"                            
[143] "Philippines"                      "Poland"                          
[145] "Portugal"                         "Qatar"                           
[147] "Romania"                          "Russia"                          
[149] "Rwanda"                           "Saint Kitts and Nevis"           
[151] "Saint Lucia"                      "Saint Vincent and the Grenadines"
[153] "Samoa"                            "San Marino"                      
[155] "Sao Tome and Principe"            "Saudi Arabia"                    
[157] "Senegal"                          "Serbia"                          
[159] "Seychelles"                       "Sierra Leone"                    
[161] "Singapore"                        "Slovakia"                        
[163] "Slovenia"                         "Solomon Islands"                 
[165] "Somalia"                          "South Africa"                    
[167] "South Sudan"                      "Spain"                           
[169] "Sri Lanka"                        "Sudan"                           
[171] "Summer Olympics 2020"             "Suriname"                        
[173] "Sweden"                           "Switzerland"                     
[175] "Syria"                            "Taiwan*"                         
[177] "Tajikistan"                       "Tanzania"                        
[179] "Thailand"                         "Timor-Leste"                     
[181] "Togo"                             "Tonga"                           
[183] "Trinidad and Tobago"              "Tunisia"                         
[185] "Turkey"                           "Tuvalu"                          
[187] "US"                               "Uganda"                          
[189] "Ukraine"                          "United Arab Emirates"            
[191] "United Kingdom"                   "Uruguay"                         
[193] "Uzbekistan"                       "Vanuatu"                         
[195] "Venezuela"                        "Vietnam"                         
[197] "West Bank and Gaza"               "Winter Olympics 2022"            
[199] "Yemen"                            "Zambia"                          
[201] "Zimbabwe"                        

Create conf.case data.frame

confirmed_raw %>% 
  filter(`Country/Region` %in% c("China", "Italy", "Japan", "United Kingdom", "US", "Korea, South",
                                 "Spain")) %>%
  select(-c(`Province/State`, Lat, Long)) %>% 
  group_by(`Country/Region`) %>% summarise_all(sum) -> test

names(test)[1]<-"country"

melt(data = test, id.vars = "country", measure.vars = names(test)[-1]) %>% 
  separate(variable, into = c("mon", "day", "year"), sep='/', extra = "merge") %>% 
  filter(day %in% c(1)) %>%
  arrange(mon, day) %>% 
  mutate(date=as.Date(with(.,paste(mon, day, year, sep="/")), format = "%m/%d/%y")) %>% 
  dcast(country ~ date) -> df.conf.case
df.conf.case
         country 2020-02-01 2020-03-01 2020-04-01 2020-05-01 2020-06-01
1          China      11891      79932      84002      86850      87520
2          Italy          2       1694     110574     207428     233197
3          Japan         20        259       2535      14558      16778
4   Korea, South         12       3736       9887      10780      11541
5          Spain          1         84     104118     215216     239638
6 United Kingdom          2         94      43755     183500     258979
7             US          8         32     227903    1115972    1809384
  2020-07-01 2020-08-01 2020-09-01 2020-10-01 2020-11-01 2020-12-01 2021-01-01
1      88344      91690      94363      95568      97250      99336     102649
2     240760     247832     270189     317409     709335    1620901    2129376
3      18732      37790      69018      84212     101936     150857     239005
4      12904      14366      20449      23952      26732      35163      62593
5     249659     288522     470973     778607    1185678    1656444    1928265
6     285276     305558     339403     462780    1038056    1647165    2549671
7    2698127    4605921    6088458    7292562    9254490   13866746   20397398
  2021-02-01 2021-03-01 2021-04-01 2021-05-01 2021-06-01 2021-07-01 2021-08-01
1     107902     109034     110169     111325     112329     113614     115473
2    2560957    2938371    3607083    4035617    4220304    4260788    4355348
3     392533     433334     477691     598754     749126     801337     936852
4      78844      90372     104194     123240     141476     158549     201002
5    2822805    3204531    3291394    3524077    3682778    3821305    4447044
6    3846807    4194287    4364544    4434156    4506331    4844879    5907641
7   26482919   28814420   30656330   32516226   33407540   33797251   35152818
  2021-09-01 2021-10-01 2021-11-01 2021-12-01 2022-01-01 2022-02-01 2022-03-01
1     117991     119790     121477     123725     128132     134564     456582
2    4546487    4675758    4774783    5043620    6266939   11116422   12829972
3    1514400    1706516    1722427    1726913    1733835    2825414    5078276
4     255401     316020     367974     457612     639083     884310    3492686
5    4861883    4961128    5011148    5174720    6294745   10039126   11036085
6    6856890    7878555    9140352   10333452   13174530   17543963   19120746
7   39585475   43694428   46163201   48743340   55099948   75570589   79228450
  2022-04-01 2022-05-01 2022-06-01 2022-07-01 2022-08-01 2022-09-01 2022-10-01
1    1400358    2024284    2097282    2137169    2265424    2510703    2762150
2   14719394   16504791   17440232   18610011   21059545   21888255   22500346
3    6614278    7910179    8876113    9355427   12935010   19116684   21329519
4   13639915   17295733   18129313   18379552   19932439   23417425   24819611
5   11551574   11896152   12360256   12818184   13226579   13342530   13422984
6   21379545   22214004   22492903   22941360   23515928   23738035   23893496
7   80252748   81483804   84556267   87832253   91515236   94659072   96369625
  2022-11-01 2022-12-01 2023-01-01 2023-02-01 2023-03-01
1    2959481    3764783    4612203    4903498    4903524
2   23531023   24260660   25143705   25453789   25576852
3   22389872   24933509   29321601   32610584   33241180
4   25670407   27208800   29139535   30213928   30533573
5   13511768   13595504   13684258   13731478   13763336
6   24122922   24251636   24365688   24507372   24603450
7   97540736   98903928  100769628  102479379  103533872

Create death.case data.frame

deaths_raw %>% 
  filter(`Country/Region` %in% c("China", "Italy", "Japan", "United Kingdom", "US", "Korea, South",
                                 "Spain")) %>%
  select(-c(`Province/State`, Lat, Long)) %>% 
  group_by(`Country/Region`) %>% summarise_all(sum) -> test

names(test)[1]<-"country"

melt(data = test, id.vars = "country", measure.vars = names(test)[-1]) %>% 
  separate(variable, into = c("mon", "day", "year"), sep='/', extra = "merge") %>% 
  filter(day %in% c(1)) %>%
  arrange(mon, day) %>% 
  mutate(date=as.Date(with(.,paste(mon, day, year, sep="/")), format = "%m/%d/%y")) %>% 
  dcast(country ~ date) -> df.death.case
df.death.case
         country 2020-02-01 2020-03-01 2020-04-01 2020-05-01 2020-06-01
1          China        259       2872       3332       4698       4708
2          Italy          0         34      13155      28236      33475
3          Japan          0          6         72        510        900
4   Korea, South          0         17        165        250        272
5          Spain          0          0       9387      24543      27127
6 United Kingdom          1          3       6070      39849      52768
7             US          0          1       6996      68518     108624
  2020-07-01 2020-08-01 2020-09-01 2020-10-01 2020-11-01 2020-12-01 2021-01-01
1       4713       4737       4797       4813       4814       4830       4884
2      34788      35146      35491      35918      38826      56361      74621
3        976       1013       1314       1583       1776       2193       3541
4        282        301        326        416        468        526        942
5      28364      28445      29152      31973      35878      45511      50837
6      56338      57454      57995      58946      64667      78184      95917
7     128134     155059     183855     206852     231054     273099     352844
  2021-02-01 2021-03-01 2021-04-01 2021-05-01 2021-06-01 2021-07-01 2021-08-01
1       4966       5011       5023       5031       5031       5033       5047
2      88845      97945     109847     121033     126221     127587     128068
3       5833       7948       9194      10326      13160      14808      15198
4       1435       1606       1737       1833       1965       2024       2099
5      59081      69609      75541      78216      79983      80883      81486
6     132799     148935     153012     154085     154509     155010     156941
7     448381     513045     549448     572904     590904     600972     609715
  2021-09-01 2021-10-01 2021-11-01 2021-12-01 2022-01-01 2022-02-01 2022-03-01
1       5056       5069       5081       5089       5103       5119       5901
2     129290     130973     132120     133931     137513     146925     155000
3      16138      17685      18274      18361      18392      18885      23908
4       2303       2504       2874       3705       5694       6787       8266
5      84472      86463      87368      88080      89405      93633      99883
6     160317     164780     169438     173903     178046     184840     188681
7     639812     699021     746135     781422     825870     892252     952086
  2022-04-01 2022-05-01 2022-06-01 2022-07-01 2022-08-01 2022-09-01 2022-10-01
1      12869      14697      14899      14928      15052      15251      15719
2     159537     163612     166756     168425     172207     175663     177130
3      28202      29605      30659      31302      32707      40245      45023
4      16929      22958      24212      24562      25084      26940      28489
5     102541     104456     106493     108111     110719     112600     114179
6     193232     198276     200347     201869     205574     207875     209346
7     983972     996109    1007741    1017872    1030654    1046956    1059542
  2022-11-01 2022-12-01 2023-01-01 2023-02-01 2023-03-01
1      15965      16001      17167      97668     101051
2     179101     181098     184642     186833     188094
3      46817      49834      57521      68407      72494
4      29239      30621      32272      33522      34003
5     115078     115901     117095     118434     119380
6     212435     214234     217175     220129     220721
7    1070821    1081153    1092779    1109996    1120897

Create Matrix for confirmed and death cases

# country.name<-c("China","Italy","Japan","Korea","Spain","UK","US")  
country.name <- unlist(df.conf.case[c(1)])

#str(df.conf.case)
m.conf.case <- as.matrix(df.conf.case[-1])
row.names(m.conf.case) <- country.name

m.death.case <- as.matrix(df.death.case[-1])
row.names(m.death.case) <- country.name

m.death.rate <- round(m.death.case/m.conf.case, 2)
  • Matrix of confirmed cases: m.conf.case

  • Matrix of deaths: m.death.case
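The conversion above relies on two things: as.matrix() applied to the numeric columns only, and / dividing two equally shaped matrices element by element. A reduced sketch using two countries and two dates copied from the tables above; the names toy.conf, toy.death, m.toy.conf, m.toy.death, and m.toy.rate are introduced here only for illustration:

```r
# Two rows and two date columns copied from df.conf.case and df.death.case
toy.conf <- data.frame(country = c("China", "Italy"),
                       `2020-03-01` = c(79932, 1694),
                       `2020-04-01` = c(84002, 110574),
                       check.names = FALSE)
toy.death <- data.frame(country = c("China", "Italy"),
                        `2020-03-01` = c(2872, 34),
                        `2020-04-01` = c(3332, 13155),
                        check.names = FALSE)

# Drop the country column so the matrix is numeric, then label the rows
m.toy.conf <- as.matrix(toy.conf[-1])
m.toy.death <- as.matrix(toy.death[-1])
row.names(m.toy.conf) <- toy.conf$country
row.names(m.toy.death) <- toy.death$country

# Element-wise division: each cell is deaths / confirmed cases
m.toy.rate <- round(m.toy.death / m.toy.conf, 2)
m.toy.rate
```

Keeping the country names as row names rather than as a column is what allows the whole matrix to be divided in one step.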

m.conf.case
               2020-02-01 2020-03-01 2020-04-01 2020-05-01 2020-06-01
China               11891      79932      84002      86850      87520
Italy                   2       1694     110574     207428     233197
Japan                  20        259       2535      14558      16778
Korea, South           12       3736       9887      10780      11541
Spain                   1         84     104118     215216     239638
United Kingdom          2         94      43755     183500     258979
US                      8         32     227903    1115972    1809384
               2020-07-01 2020-08-01 2020-09-01 2020-10-01 2020-11-01
China               88344      91690      94363      95568      97250
Italy              240760     247832     270189     317409     709335
Japan               18732      37790      69018      84212     101936
Korea, South        12904      14366      20449      23952      26732
Spain              249659     288522     470973     778607    1185678
United Kingdom     285276     305558     339403     462780    1038056
US                2698127    4605921    6088458    7292562    9254490
               2020-12-01 2021-01-01 2021-02-01 2021-03-01 2021-04-01
China               99336     102649     107902     109034     110169
Italy             1620901    2129376    2560957    2938371    3607083
Japan              150857     239005     392533     433334     477691
Korea, South        35163      62593      78844      90372     104194
Spain             1656444    1928265    2822805    3204531    3291394
United Kingdom    1647165    2549671    3846807    4194287    4364544
US               13866746   20397398   26482919   28814420   30656330
               2021-05-01 2021-06-01 2021-07-01 2021-08-01 2021-09-01
China              111325     112329     113614     115473     117991
Italy             4035617    4220304    4260788    4355348    4546487
Japan              598754     749126     801337     936852    1514400
Korea, South       123240     141476     158549     201002     255401
Spain             3524077    3682778    3821305    4447044    4861883
United Kingdom    4434156    4506331    4844879    5907641    6856890
US               32516226   33407540   33797251   35152818   39585475
               2021-10-01 2021-11-01 2021-12-01 2022-01-01 2022-02-01
China              119790     121477     123725     128132     134564
Italy             4675758    4774783    5043620    6266939   11116422
Japan             1706516    1722427    1726913    1733835    2825414
Korea, South       316020     367974     457612     639083     884310
Spain             4961128    5011148    5174720    6294745   10039126
United Kingdom    7878555    9140352   10333452   13174530   17543963
US               43694428   46163201   48743340   55099948   75570589
               2022-03-01 2022-04-01 2022-05-01 2022-06-01 2022-07-01
China              456582    1400358    2024284    2097282    2137169
Italy            12829972   14719394   16504791   17440232   18610011
Japan             5078276    6614278    7910179    8876113    9355427
Korea, South      3492686   13639915   17295733   18129313   18379552
Spain            11036085   11551574   11896152   12360256   12818184
United Kingdom   19120746   21379545   22214004   22492903   22941360
US               79228450   80252748   81483804   84556267   87832253
               2022-08-01 2022-09-01 2022-10-01 2022-11-01 2022-12-01
China             2265424    2510703    2762150    2959481    3764783
Italy            21059545   21888255   22500346   23531023   24260660
Japan            12935010   19116684   21329519   22389872   24933509
Korea, South     19932439   23417425   24819611   25670407   27208800
Spain            13226579   13342530   13422984   13511768   13595504
United Kingdom   23515928   23738035   23893496   24122922   24251636
US               91515236   94659072   96369625   97540736   98903928
               2023-01-01 2023-02-01 2023-03-01
China             4612203    4903498    4903524
Italy            25143705   25453789   25576852
Japan            29321601   32610584   33241180
Korea, South     29139535   30213928   30533573
Spain            13684258   13731478   13763336
United Kingdom   24365688   24507372   24603450
US              100769628  102479379  103533872
m.death.case
               2020-02-01 2020-03-01 2020-04-01 2020-05-01 2020-06-01
China                 259       2872       3332       4698       4708
Italy                   0         34      13155      28236      33475
Japan                   0          6         72        510        900
Korea, South            0         17        165        250        272
Spain                   0          0       9387      24543      27127
United Kingdom          1          3       6070      39849      52768
US                      0          1       6996      68518     108624
               2020-07-01 2020-08-01 2020-09-01 2020-10-01 2020-11-01
China                4713       4737       4797       4813       4814
Italy               34788      35146      35491      35918      38826
Japan                 976       1013       1314       1583       1776
Korea, South          282        301        326        416        468
Spain               28364      28445      29152      31973      35878
United Kingdom      56338      57454      57995      58946      64667
US                 128134     155059     183855     206852     231054
               2020-12-01 2021-01-01 2021-02-01 2021-03-01 2021-04-01
China                4830       4884       4966       5011       5023
Italy               56361      74621      88845      97945     109847
Japan                2193       3541       5833       7948       9194
Korea, South          526        942       1435       1606       1737
Spain               45511      50837      59081      69609      75541
United Kingdom      78184      95917     132799     148935     153012
US                 273099     352844     448381     513045     549448
               2021-05-01 2021-06-01 2021-07-01 2021-08-01 2021-09-01
China                5031       5031       5033       5047       5056
Italy              121033     126221     127587     128068     129290
Japan               10326      13160      14808      15198      16138
Korea, South         1833       1965       2024       2099       2303
Spain               78216      79983      80883      81486      84472
United Kingdom     154085     154509     155010     156941     160317
US                 572904     590904     600972     609715     639812
               2021-10-01 2021-11-01 2021-12-01 2022-01-01 2022-02-01
China                5069       5081       5089       5103       5119
Italy              130973     132120     133931     137513     146925
Japan               17685      18274      18361      18392      18885
Korea, South         2504       2874       3705       5694       6787
Spain               86463      87368      88080      89405      93633
United Kingdom     164780     169438     173903     178046     184840
US                 699021     746135     781422     825870     892252
               2022-03-01 2022-04-01 2022-05-01 2022-06-01 2022-07-01
China                5901      12869      14697      14899      14928
Italy              155000     159537     163612     166756     168425
Japan               23908      28202      29605      30659      31302
Korea, South         8266      16929      22958      24212      24562
Spain               99883     102541     104456     106493     108111
United Kingdom     188681     193232     198276     200347     201869
US                 952086     983972     996109    1007741    1017872
               2022-08-01 2022-09-01 2022-10-01 2022-11-01 2022-12-01
China               15052      15251      15719      15965      16001
Italy              172207     175663     177130     179101     181098
Japan               32707      40245      45023      46817      49834
Korea, South        25084      26940      28489      29239      30621
Spain              110719     112600     114179     115078     115901
United Kingdom     205574     207875     209346     212435     214234
US                1030654    1046956    1059542    1070821    1081153
               2023-01-01 2023-02-01 2023-03-01
China               17167      97668     101051
Italy              184642     186833     188094
Japan               57521      68407      72494
Korea, South        32272      33522      34003
Spain              117095     118434     119380
United Kingdom     217175     220129     220721
US                1092779    1109996    1120897

Access to the matrix

  • UK’s total confirmed cases on 2021-10-01

  • South Korea’s total confirmed cases on 2021-10-01

  • China’s total confirmed cases on 2021-10-01

  • Increase in South Korea’s confirmed cases on 2021-10-01 compared with the previous month

  • Increase in Japan’s confirmed cases on 2021-10-01 compared with the previous month
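
The lookups listed above can be sketched with name-based indexing. This is a minimal, self-contained sketch: it rebuilds only a small slice of m.conf.case (four countries, the 2021-09-01 and 2021-10-01 columns, values copied from the printout above) under the hypothetical name m.small. With the full matrix, the same expressions should work on m.conf.case directly.

```r
# Small slice of m.conf.case, values copied from the printed matrix above.
# m.small is a hypothetical name used only for this illustration.
m.small = matrix(c(117991, 1514400, 255401, 6856890,
                   119790, 1706516, 316020, 7878555),
                 nrow = 4,
                 dimnames = list(c("China", "Japan", "Korea, South", "United Kingdom"),
                                 c("2021-09-01", "2021-10-01")))

# Total confirmed cases on 2021-10-01: index by row name and column name
uk.total    = m.small["United Kingdom", "2021-10-01"]
korea.total = m.small["Korea, South", "2021-10-01"]
china.total = m.small["China", "2021-10-01"]

# Increase over the previous month: subtract the preceding column
korea.increase = m.small["Korea, South", "2021-10-01"] - m.small["Korea, South", "2021-09-01"]
japan.increase = m.small["Japan", "2021-10-01"] - m.small["Japan", "2021-09-01"]

korea.increase
japan.increase
```

Indexing by name rather than by position keeps the code readable and still works if columns are added or reordered.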


Create three vectors for the next step:


Population vector

Country names vector, created to label the elements of the population vector

country.name
        country1         country2         country3         country4 
         "China"          "Italy"          "Japan"   "Korea, South" 
        country5         country6         country7 
         "Spain" "United Kingdom"             "US" 

Vector holding the populations of the selected countries, in the same order as country.name

pop<-c(1439323776, 60461826, 126476461, 51269185, 46754778, 67886011, 331002651)
pop
[1] 1439323776   60461826  126476461   51269185   46754778   67886011  331002651
names(pop)
NULL

Use the names() function to label each element of the pop vector with its country.

pop<-c(1439323776, 60461826, 126476461, 51269185, 46754778, 67886011, 331002651)
names(pop)<-country.name
pop
         China          Italy          Japan   Korea, South          Spain 
    1439323776       60461826      126476461       51269185       46754778 
United Kingdom             US 
      67886011      331002651 

GDP vector

Likewise, use the names() function to record which country each GDP value belongs to.

# round(m.conf.case/pop*1000, 2)
country.name<-c("China","Italy","Japan","Korea","Spain","UK","US")  
GDP<-c(12237700479375,
1943835376342,
4872415104315,
1530750923149,
1314314164402,
2637866340434,
19485394000000)
names(GDP)<-country.name

GDP
       China        Italy        Japan        Korea        Spain           UK 
1.223770e+13 1.943835e+12 4.872415e+12 1.530751e+12 1.314314e+12 2.637866e+12 
          US 
1.948539e+13 

Pop density

Let’s create a population density vector for the selected countries.

country.name<-c("China","Italy","Japan","Korea","Spain","UK","US")  
pop.density<-c(148, 205, 347, 530, 94, 275, 36)
names(pop.density)<-country.name

Check the created vectors

pop
         China          Italy          Japan   Korea, South          Spain 
    1439323776       60461826      126476461       51269185       46754778 
United Kingdom             US 
      67886011      331002651 
GDP
       China        Italy        Japan        Korea        Spain           UK 
1.223770e+13 1.943835e+12 4.872415e+12 1.530751e+12 1.314314e+12 2.637866e+12 
          US 
1.948539e+13 
pop.density
China Italy Japan Korea Spain    UK    US 
  148   205   347   530    94   275    36 

Let’s visualize the GDP of each country.

barplot(GDP)

Sort by largest GDP

barplot(sort(GDP))

barplot(sort(GDP, decreasing = T))

Let’s think.

How about a bar graph of GDP per capita?

We have

  • GDP vector

  • Population vector

We know

  • barplot()

  • Vector calculation

  • sort()

  • decreasing=T option
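
One way to put these pieces together is sketched below. It assumes that GDP per capita is simply GDP divided by population, stored under the name GDP.pc; the vectors are repeated so that the block runs on its own. The division is matched by position, not by name, so both vectors must list the countries in the same order.

```r
# Population and GDP vectors as defined above (same country order)
pop = c(1439323776, 60461826, 126476461, 51269185, 46754778, 67886011, 331002651)
GDP = c(12237700479375, 1943835376342, 4872415104315, 1530750923149,
        1314314164402, 2637866340434, 19485394000000)
names(GDP) = c("China", "Italy", "Japan", "Korea", "Spain", "UK", "US")

# Vector calculation: element-wise division, position by position.
# The result keeps the names of the first operand (GDP).
GDP.pc = GDP / pop

# Bar graph sorted from the largest to the smallest GDP per capita
barplot(sort(GDP.pc, decreasing = T), ylab = "GDP per capita")
```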


matplot

matplot(m.conf.case)

matplot(t(m.conf.case))


matplot(t(m.conf.case), type='b')

matplot(t(m.conf.case), type='b', pch=15:20)

matplot(t(m.conf.case), type='b', pch=15:20, col=c(1:6, 8), 
        ylab="Confirmed cases")

matplot(t(m.conf.case), type='b', pch=15:20, col=c(1:6, 8), 
        ylab="Confirmed cases")
legend("topleft", inset=0.01, legend=country.name, pch=15:20, col=c(1:6, 8), horiz=F)


Try the same graph but now use the death rate
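
The text does not show how the death rate matrix is built, so the sketch below is only one possible reading. It assumes the death rate means cumulative deaths per 1,000 people, following the round(m.conf.case/pop*1000, 2) pattern that appears as a comment in the GDP code. To stay self-contained it uses a small slice of m.death.case (three countries, the last three dates, values copied from the printout above); d.small, pop.small, and rate.small are hypothetical names.

```r
# Slice of m.death.case (three countries, last three dates), copied from the printout above.
# d.small, pop.small and rate.small are hypothetical names for this illustration.
d.small = matrix(c(184642, 57521, 32272,
                   186833, 68407, 33522,
                   188094, 72494, 34003),
                 nrow = 3,
                 dimnames = list(c("Italy", "Japan", "Korea, South"),
                                 c("2023-01-01", "2023-02-01", "2023-03-01")))
pop.small = c(60461826, 126476461, 51269185)   # populations in the same row order

# A vector whose length equals the number of rows is recycled down each column,
# so every row is divided by its own country's population.
# Assumed definition of "death rate": cumulative deaths per 1,000 people.
rate.small = round(d.small / pop.small * 1000, 2)

# Same plotting pattern as for the confirmed cases: transpose so each country is one line
matplot(t(rate.small), type = 'b', pch = 15:17, col = 1:3,
        ylab = "Deaths per 1,000 people")
legend("topleft", inset = 0.01, legend = rownames(rate.small), pch = 15:17, col = 1:3)
```

If the death rate is instead meant as deaths divided by confirmed cases, the two matrices can be divided element by element in the same way, since they share the same dimensions.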

Countries’ wealth and COVID-19

I’m now curious about the relationship between countries’ GDP per capita and the death rate at a given point in time

plot(GDP.pc, m.death.rate[,10])

plot(GDP.pc, m.death.rate[,10],
     ylab = "Death rate")
# Use the same column as in plot() so the labels sit next to their points
text(GDP.pc, m.death.rate[,10], row.names(m.death.rate),
     cex = 1, pos = 4, col = "blue")






Increasing rate of confirmed cases

Visualize it like the example below (in 5 mins)

Let’s omit the US for a clearer view

Let’s also visualize the first three periods and the most recent four periods


Can you also do the same visualization for specific countries such as Korea, China, and Japan?
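
A possible starting point for this exercise is sketched below. It assumes that the increasing rate means the monthly increase expressed as a percentage of the previous month's total; the intended definition may differ. The block uses a small slice of m.conf.case (China, Japan, and South Korea over the four most recent periods, values copied from the printout above) under the hypothetical name c.small.

```r
# Slice of m.conf.case: the four most recent periods for three countries,
# copied from the printout above. c.small is a hypothetical name.
c.small = matrix(c(3764783, 24933509, 27208800,
                   4612203, 29321601, 29139535,
                   4903498, 32610584, 30213928,
                   4903524, 33241180, 30533573),
                 nrow = 3,
                 dimnames = list(c("China", "Japan", "Korea, South"),
                                 c("2022-12-01", "2023-01-01", "2023-02-01", "2023-03-01")))

n = ncol(c.small)
# Monthly increase: every column minus the column before it
increase = c.small[, -1] - c.small[, -n]
# Assumed "increasing rate": the increase as a percentage of the previous month's total
inc.rate = round(increase / c.small[, -n] * 100, 2)

# Grouped bars: one group per period, one bar per country
barplot(inc.rate, beside = T, legend.text = rownames(inc.rate),
        ylab = "Increase over previous month (%)")
```

With the full matrix, a row can be dropped by name, for example m.conf.case[rownames(m.conf.case) != "US", ], and the first three periods can be selected with [, 1:3].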