Training Programs
Introduction to Data Management
Overview
This course introduces the principles and practice of data management.
Features
We cover the following:
- Motivation for data management
- Data principles, life-cycle, protection and regulation
- Data creation
- The data technology ecosystem
- Presenting with data
Version Control with Git
Overview
We present and demonstrate Git, a distributed version control system for source code, and GitHub, a hosting platform for Git repositories. Together they facilitate collaboration and version control among software development and data science teams.
Features
This course comprises the following modules:
- Version Control
- Git
- Local Repository
- Remote Repository
Getting Started with R Programming
Overview
We set up R and RStudio and establish the foundation for writing and organizing R programs.
Features
- 9 Videos
- 1 Quiz
- 1 Practice
R Data and Data Structures
Overview
We present R data structures and demonstrate their use in data manipulation.
Features
- 14 Videos
- 1 Quiz
- 1 Practice
R Projects
Overview
We present best practices for setting up and using projects in RStudio for reproducible analysis. We also introduce Git, a version control system that RStudio integrates with.
Features
- 5 Videos
- 1 Quiz
- 1 Practice
R Functions
Functions are reusable blocks of code that support reproducible analysis.
In this course, we demonstrate the use of functions for summary statistics and introduce defining and calling your own functions.
Tibbles
Overview
We present and use the tibble, a data structure similar to a data frame but with some improvements.
Features
- 2 Videos
- 1 Quiz
- 1 Practice
dplyr
Overview
We present and demonstrate data manipulation with the dplyr package.
Features
- 10 Videos
- 1 Quiz
- 1 Practice
ggplot2
Overview
We present and demonstrate ggplot2, an R graphics system for publication-ready static data visualization based on the grammar of graphics.
Features
- 6 Videos
- 1 Quiz
- 1 Practice
tidyr
Overview
We demonstrate tools in tidyr, a package for tidying messy data, for working with a variety of wide and long data formats.
Features
- 1 Video
- 1 Quiz
- 1 Practice
NumPy
NumPy is a Python library for numerical computing.
NumPy provides a data structure for representing n-dimensional arrays, such as vectors and matrices, which is the foundation for vectorized numerical computing.
In this course, we demonstrate the creation and use of NumPy arrays in numerical computing.
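As a rough illustration of what the course covers, the sketch below creates a vector and a matrix as NumPy arrays and applies a few vectorized operations to them. The variable names and values are made up for this example and are not taken from the course material.

```python
import numpy as np

# A vector (1-D array) and a matrix (2-D array); the values are illustrative only.
vector = np.array([1.0, 2.0, 3.0])
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# Vectorized operations act on whole arrays, with no explicit Python loop.
scaled = vector * 2.0                # element-wise multiplication
product = matrix @ vector            # matrix-vector product, shape (2,)
column_means = matrix.mean(axis=0)   # mean of each column, shape (3,)

print(vector.ndim, matrix.shape)
print(scaled)
print(product)
print(column_means)
```

Each operation here is written once for the whole array, which is what "vectorized" means in practice: the element-by-element work is done inside NumPy rather than in a hand-written loop.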