This vignette aims to be a brief introduction and overview of using SQL. The tutorial will illustrate common functions by answering questions around Pixar films and their performance.
Load packages.
library(pixarfilms)
library(dplyr)
library(dbplyr)
#>
#> Attaching package: 'dbplyr'
#> The following objects are masked from 'package:dplyr':
#>
#> ident, sql
Load data into temporary database to interact with.
Develop query.
Let’s execute it now.
DBI::dbGetQuery(conn = con, statement = query)
#> number film release_date run_time film_rating
#> 1 1 Toy Story 9456 81 G
#> 2 2 A Bug's Life 10555 95 G
#> 3 3 Toy Story 2 10919 92 G
#> 4 4 Monsters, Inc. 11628 92 G
#> 5 5 Finding Nemo 12202 100 G
#> 6 6 The Incredibles 12727 115 PG
#> 7 7 Cars 13308 117 G
#> 8 8 Ratatouille 13693 111 G
#> 9 9 WALL-E 14057 98 G
#> 10 10 Up 14393 96 PG
#> 11 11 Toy Story 3 14778 103 G
#> 12 12 Cars 2 15149 106 G
#> 13 13 Brave 15513 93 PG
#> 14 14 Monsters University 15877 104 G
#> 15 15 Inside Out 16605 95 PG
#> 16 16 The Good Dinosaur 16764 93 PG
#> 17 17 Finding Dory 16969 97 PG
#> 18 18 Cars 3 17333 102 G
#> 19 19 Coco 17492 105 PG
#> 20 20 Incredibles 2 17697 118 PG
#> 21 21 Toy Story 4 18068 100 G
#> 22 22 Onward 18327 102 PG
#> 23 23 Soul 18621 100 PG
#> 24 24 Luca 18796 151 N/A
#> 25 25 Turning Red 19062 NA N/A
#> 26 26 Lightyear 19160 NA N/A
#> 27 27 <NA> 19524 155 Not Rated
Here, let’s review the functions of SQL covered above.
SELECT
= text that follows this chooses the columns you
want in your query, and doing SELECT *
will select all
columnssessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] dbplyr_2.5.0 irr_0.84.1 lpSolve_5.6.20 forcats_1.0.0
#> [5] tidyr_1.3.1 patchwork_1.2.0 lubridate_1.9.3 ggplot2_3.5.1
#> [9] dplyr_1.1.4 pixarfilms_0.2.1 readr_2.1.5 rmarkdown_2.28
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.5 beeswarm_0.4.0 xfun_0.47 bslib_0.8.0
#> [5] ggrepel_0.9.5 tzdb_0.4.0 vctrs_0.6.5 tools_4.4.1
#> [9] generics_0.1.3 curl_5.2.1 parallel_4.4.1 tibble_3.2.1
#> [13] fansi_1.0.6 RSQLite_2.3.7 highr_0.11 blob_1.2.4
#> [17] pkgconfig_2.0.3 RColorBrewer_1.1-3 lifecycle_1.0.4 compiler_4.4.1
#> [21] farver_2.1.2 munsell_0.5.1 vipor_0.4.7 htmltools_0.5.8.1
#> [25] sys_3.4.2 buildtools_1.0.0 sass_0.4.9 yaml_2.3.10
#> [29] pillar_1.9.0 crayon_1.5.3 jquerylib_0.1.4 cachem_1.1.0
#> [33] tidyselect_1.2.1 digest_0.6.37 purrr_1.0.2 maketools_1.3.0
#> [37] labeling_0.4.3 fastmap_1.2.0 grid_4.4.1 colorspace_2.1-1
#> [41] cli_3.6.3 magrittr_2.0.3 utf8_1.2.4 withr_3.0.1
#> [45] scales_1.3.0 bit64_4.0.5 ggbeeswarm_0.7.2 timechange_0.3.0
#> [49] bit_4.0.5 hms_1.1.3 memoise_2.0.1 evaluate_0.24.0
#> [53] knitr_1.48 rlang_1.1.4 Rcpp_1.0.13 glue_1.7.0
#> [57] DBI_1.2.3 vroom_1.6.5 jsonlite_1.8.8 R6_2.5.1