Data Management, Analysis and Graphics with R Course

April 15 @ 8:00 am - April 26 @ 5:00 pm EAT

Introduction to Data Science with R

  • What is analytics & Data Science?
  • Common Terms in Analytics
  • Analytics vs. Data warehousing, OLAP, MIS Reporting
  • Relevance in industry and need of the hour
  • Types of problems and business objectives in various industries
  • How leading companies are harnessing the power of analytics
  • Critical success drivers
  • Overview of analytics tools & their popularity
  • Analytics Methodology & problem-solving framework
  • List of steps in Analytics projects
  • Identify the most appropriate solution design for the given problem statement
  • Project plan for Analytics project & key milestones based on effort estimates
  • Build Resource plan for analytics project
  • Why R for data science?

 

Data Import & Export

  • Introduction to R and the RStudio GUI
  • Concept of Packages – Useful Packages (Base & Other packages)
  • Data Structure & Data Types (Vectors, Matrices, factors, Data frames, and Lists)
  • Importing Data from various sources (txt, dlm, excel, sas7bdat, db, etc.)
  • Database Input (Connecting to a database)
  • Exporting Data to various formats
  • Viewing Data (Viewing partial data and full data)
  • Variable & Value Labels – Date Values

Data Manipulation

  • Data Manipulation steps
  • Creating New Variables (calculations & Binning)
  • Dummy variable creation
  • Applying transformations
  • Handling duplicates
  • Handling missing values
  • Sorting and Filtering
  • Subsetting (Rows/Columns)
  • Appending (Row appending/column appending)
  • Merging/Joining (left, right, inner, full outer, etc.)
  • Data type conversions
  • Renaming
  • Formatting
  • Reshaping data
  • Sampling
  • Data manipulation tools
  • Loops (Conditional, iterative loops, apply functions)
  • Arrays
  • R Built-in Functions (Text, Numeric, Date, utility)
  • Numerical Functions
  • Text Functions
  • Date Functions
  • Utility Functions
  • R User Defined Functions

Data Analysis and Visualization

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
  • R Packages for Exploratory Data Analysis (dplyr, plyr, gmodels, car, vcd, Hmisc, psych, doBy, etc.)
  • R Packages for Graphical Analysis (base, ggplot2, lattice, etc.)

Introduction to Statistics

  • Basic Statistics – Measures of Central Tendency and Variance
  • Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
  • Inferential Statistics – Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (One sample, independent, paired), ANOVA, Correlations and Chi-square

Introduction to Predictive Modeling

  • Concept of a model in analytics and how it is used
  • Common terminology used in analytics & modeling process
  • Popular modeling algorithms
  • Types of Business problems – Mapping of Techniques
  • Different Phases of Predictive Modeling

Data Exportation for Modelling

Data Preparation

  • Need for data preparation
  • Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values – Dummy creation – Variable Reduction
  • Variable Reduction Techniques – Factor Analysis & PCA

Segmentation: Solving segmentation problems

  • Introduction to Segmentation
  • Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
  • Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
  • Behavioral Segmentation Techniques (K-Means Cluster Analysis)
  • Cluster evaluation and profiling – Identify cluster characteristics

Linear Regression

  • Interpretation of results – Implementation on new data
  • Introduction – Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
  • Assess the overall effectiveness of the model
  • Validation of Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers, etc.)
  • Interpretation of Results – Business Validation – Implementation on new data

Logistic Regression

  • Introduction – Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model (Binary Logistic Model)
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
  • Validation of Logistic Regression Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc.)
  • Interpretation of Results – Business Validation – Implementation on new data

Time Series Forecasting

  • Introduction – Applications
  • Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Classification of Techniques (Pattern-based – Pattern-less)

Way Forward After the Training

With the help of facilitators, participants will develop a work plan that sets out how they will apply the skills acquired to improve their organizations. ASPM will continuously monitor implementation progress after the training.

Training Evaluation:

Participants will take a simple assessment before the training to gauge their knowledge and skills, and another after the training in order to measure the knowledge gained.

Organizer

ASPM
+254 737 022726
info@aspm.co.ke

Venue

Nairobi
Nairobi, Kenya
+254 737 022726
