AI014 Professional

An Introduction to R Programming

4.9
30h
716 students
Artificial Intelligence

Course Overview

📚 Content Summary

This course is a comprehensive introduction to the R language environment, covering core topics ranging from basic numerical vector operations, object attributes, and array/matrix handling to list/data frame management, statistical modeling, and high-quality graphics production. It is suitable as an introductory textbook for statistical analysis and data science.

Master the core of the R language and open the door to statistical computing and data visualization.

Author: R Development Core Team

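Acknowledgments: This manual is maintained by the R Development Core Team. The Chinese edition gratefully acknowledges Shigeru MASE's Japanese translation, on which it is based, and the contributions of Dr. ZP Li, Dr. Rui Li, and the other members of the Chinese translation team.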

🎯 Learning Objectives

  1. Initialize R sessions, navigate the help system, and apply basic syntax rules (case sensitivity, assignments, and comments).
  2. Create and distinguish between logical and character vectors, and handle missing values (NA and NaN).
  3. Utilize four distinct indexing methods to select, exclude, or modify specific subsets of data.
  4. Identify and modify the intrinsic attributes (mode and length) of R objects.
  5. Utilize the class() and attr() functions to manage object metadata and data structures.
  6. Create and manipulate factors and ordered factors to represent categorical data.
  7. Define and construct arrays and matrices using dimension vectors and the array() function.
  8. Apply advanced indexing techniques, including the use of index matrices to extract or modify specific elements.
  9. Execute linear algebra operations including outer products, generalized transposes, and matrix inversions.
  10. Construct and Modify Lists: Create named and unnamed lists and combine them using specific R syntax.

🔹 Lesson 1: Introduction to R and Vector Basics

Overview: This lesson introduces the foundational environment of R, covering its basic syntax, command execution, and help systems. It further explores specialized vector types—logical, character, and missing values—and provides detailed methods for selecting and modifying data subsets through index vectors.

Learning Outcomes:

  • Initialize R sessions, navigate the help system, and apply basic syntax rules (case sensitivity, assignments, and comments).
  • Create and distinguish between logical and character vectors, and handle missing values (NA and NaN).
  • Utilize four distinct indexing methods to select, exclude, or modify specific subsets of data.
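
The four indexing methods named in the outcomes above can be sketched as follows. This is a minimal illustration rather than course material: the vector x, its values, and its names are made up for the example.

```r
# Hypothetical example data; the values and names are illustrative only.
x <- c(10, NA, 30, 40, 50)
names(x) <- c("a", "b", "c", "d", "e")

# 1. Logical index vector: keep the non-missing values greater than 20.
big <- x[!is.na(x) & x > 20]

# 2. Positive integer index vector: select the first and third elements.
first_third <- x[c(1, 3)]

# 3. Negative integer index vector: exclude the second element.
without_second <- x[-2]

# 4. Character index vector: select elements by name.
by_name <- x[c("a", "e")]

# An index on the left-hand side modifies a subset: replace missing values with 0.
y <- x
y[is.na(y)] <- 0
```

Note that NA marks a missing value of any type, while NaN arises from undefined numeric results such as 0/0; is.na() is TRUE for both, whereas is.nan() is TRUE only for NaN.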

🔹 Lesson 2: Object Attributes and Factor Handling

Overview: This lesson covers the fundamental properties of R objects, specifically their intrinsic attributes like mode and length, and how these can be queried or modified. It also explores "factors"—a specialized data structure for handling categorical variables—and demonstrates how to use the tapply() function to perform grouped statistical analysis across factor levels.

Learning Outcomes:

  • Identify and modify the intrinsic attributes (mode and length) of R objects.
  • Utilize the class() and attr() functions to manage object metadata and data structures.
  • Create and manipulate factors and ordered factors to represent categorical data.
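
As a rough sketch of the ideas in this lesson, the snippet below queries and changes mode and length, attaches an attribute, builds factors, and uses tapply() for a grouped mean. The state and income values are hypothetical and chosen only to keep the example small.

```r
# Hypothetical data: home states and incomes for six people.
state   <- c("nsw", "vic", "nsw", "qld", "vic", "nsw")
incomes <- c(60, 49, 40, 61, 64, 70)

# Intrinsic attributes: every object has a mode and a length.
mode(incomes)     # "numeric"
length(incomes)   # 6

# Changing the mode coerces the object; shortening the length truncates it.
digits <- as.character(incomes)
short  <- incomes
length(short) <- 3

# attr() attaches arbitrary metadata to an object.
attr(short, "units") <- "thousands"

# A factor stores categorical data as a set of levels.
statef <- factor(state)
levels(statef)    # "nsw" "qld" "vic"

# An ordered factor also records the order of its levels.
size <- factor(c("low", "high", "mid"),
               levels = c("low", "mid", "high"), ordered = TRUE)

# tapply() applies a function to the incomes within each factor level.
mean_income <- tapply(incomes, statef, mean)
```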

🔹 Lesson 3: Arrays, Matrices, and Linear Algebra

Overview: This lesson explores R’s robust capabilities for handling multi-dimensional data through arrays and matrices. Students will learn how to define data structures using dimension vectors, perform complex indexing, and execute essential linear algebra operations—such as matrix multiplication, inversion, and decompositions—critical for statistical computing and data analysis.

Learning Outcomes:

  • Define and construct arrays and matrices using dimension vectors and the array() function.
  • Apply advanced indexing techniques, including the use of index matrices to extract or modify specific elements.
  • Execute linear algebra operations including outer products, generalized transposes, and matrix inversions.
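
The following sketch shows one way these operations might look in practice. All object names and values are illustrative assumptions, not taken from the course.

```r
# A 3 x 4 array built from a data vector and a dimension vector
# (R fills arrays column by column).
x <- array(1:12, dim = c(3, 4))

# An index matrix: each row gives the (row, column) position of one element.
i <- array(c(1, 2, 3, 3, 2, 1), dim = c(3, 2))
picked <- x[i]        # elements x[1, 3], x[2, 2], x[3, 1]
x[i] <- 0             # the same index matrix can be used to modify them

# Outer product of two vectors: o[r, c] equals r * c.
o <- outer(1:3, 1:2)

# Generalized transpose: aperm() permutes the dimensions of an array.
a  <- array(1:24, dim = c(2, 3, 4))
at <- aperm(a, c(3, 1, 2))   # at[k, i, j] equals a[i, j, k]

# Matrix inversion and solving a linear system.
m     <- matrix(c(2, 1, 1, 3), nrow = 2)
m_inv <- solve(m)            # inverse of m
b     <- c(1, 2)
sol   <- solve(m, b)         # solves m %*% sol = b without forming the inverse
```

For solving linear systems, solve(m, b) is generally preferable to computing solve(m) %*% b, since it avoids forming the inverse explicitly.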

🔹 Lesson 4: Data Management: Lists, Data Frames, and I/O

Overview: This lesson covers the fundamentals of handling complex data structures and external data in R. It focuses on Lists—flexible containers that hold components of different types—and the practicalities of Input/Output (I/O), including reading external files into list structures, accessing built-in datasets from packages, and using interactive editing tools to modify data.

Learning Outcomes:

  • Construct and Modify Lists: Create named and unnamed lists and combine them using specific R syntax.
  • Component Access: Distinguish between and apply different indexing methods ([[ ]], [ ], and $) to retrieve list data.
  • External Data Input: Use the scan() function to read data from external files into structured lists or matrices.
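
A small sketch of list construction, the three access forms, and scan() follows. The list contents and the temporary data file are invented for the illustration; a real session would point scan() at an existing file.

```r
# A named list whose components have different types (hypothetical values).
lst <- list(name = "Fred", wife = "Mary", no.children = 3,
            child.ages = c(4, 7, 9))

# [[ ]] and $ extract a single component; [ ] returns a sublist.
ages    <- lst[["child.ages"]]   # numeric vector
name    <- lst$name              # character string
sublist <- lst[1:2]              # still a list, with two components

# Modify a component and concatenate lists with c().
lst$no.children <- 4
combined <- c(lst, list(city = "Perth"))

# scan() can read a file into a list: the 'what' argument gives the
# name and type of each field in a record.
path <- tempfile(fileext = ".dat")   # temporary file used only for this sketch
writeLines(c("a 1.5 10", "b 2.5 20"), path)
inp <- scan(path, what = list(id = "", x = 0, y = 0), quiet = TRUE)
```

The practical difference between the access forms is the return type: lst[1] is a list of length one, while lst[[1]] is the component itself.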

🔹 Lesson 5: Probability Distributions and Statistical Tests

Overview: This lesson provides a comprehensive guide to handling probability distributions and conducting statistical inference in R. Students will learn to utilize R’s standardized prefix system (d, p, q, r) for distribution functions, generate descriptive statistics, and visually assess data using empirical cumulative distribution functions (ecdf) and Q-Q plots.

Learning Outcomes:

  • Master the R distribution nomenclature (prefixes d, p, q, r) and apply it to standard distributions such as the normal, t, and F distributions.
  • Construct and interpret visual diagnostic tools, specifically empirical cumulative distribution functions (ecdf) and Quantile-Quantile (Q-Q) plots, to evaluate distributional fit.
  • Execute and differentiate between parametric and non-parametric tests, including Welch t-tests, Shapiro-Wilk normality tests, and Kolmogorov-Smirnov tests.
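
The prefix system and the tests listed above might be exercised as in the sketch below. The samples are simulated with a fixed seed, so the data and parameter values are assumptions made for the example only; qqnorm() is called with plot.it = FALSE so that the snippet runs without a graphics device.

```r
set.seed(1)  # fixed seed so the simulated samples are reproducible

# The four prefixes, shown for the normal distribution.
dnorm(0)        # density at 0
pnorm(1.96)     # cumulative probability P(X <= 1.96)
qnorm(0.975)    # quantile function, the inverse of pnorm
x <- rnorm(50, mean = 10, sd = 2)   # random sample (simulated data)
y <- rnorm(50, mean = 11, sd = 3)

# The same prefixes apply to other distributions, such as t and F.
qt(0.975, df = 10)
pf(2.5, df1 = 3, df2 = 20)

# Descriptive statistics and the empirical CDF.
summary(x)
Fn <- ecdf(x)   # Fn is a function: Fn(q) is the proportion of x <= q
Fn(10)

# Q-Q coordinates against the normal distribution, without drawing.
qq <- qqnorm(x, plot.it = FALSE)

# Shapiro-Wilk normality test, Welch two-sample t-test,
# and two-sample Kolmogorov-Smirnov test.
sw    <- shapiro.test(x)
welch <- t.test(x, y)   # var.equal = FALSE by default, giving the Welch test
ks    <- ks.test(x, y)
```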

🔹 Lesson 6: Program Control and Iterative Logic

Overview: This lesson covers the fundamental mechanisms for controlling the flow of execution in R. It focuses on grouping multiple expressions into single units and utilizing control statements—including conditional branching (if-else) and various looping structures (for, repeat, and while)—to automate data analysis tasks and handle complex logic.

Learning Outcomes:

  • Group multiple R expressions into a single statement using braces.
  • Implement conditional logic to execute specific code blocks based on logical criteria.
  • Construct iterative loops to automate repetitive operations over data structures like vectors and lists.
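
A compact sketch of the constructs covered in this lesson, using made-up values:

```r
# Braces group several expressions; the value of the group is the
# value of the last expression.
total <- {
  a <- 2
  b <- 3
  a + b
}

# if / else chooses a branch from a single logical value.
x <- c(4, 8, 15)
label <- if (mean(x) > 5) "high" else "low"

# for loops over the elements of a vector or list.
squares <- numeric(length(x))
for (i in seq_along(x)) {
  squares[i] <- x[i]^2
}

# while repeats as long as the condition is TRUE.
n <- 0
while (2^n < 100) {
  n <- n + 1
}

# repeat loops until break is reached.
k <- 1
repeat {
  k <- k * 2
  if (k > 50) break
}
```

In everyday R code, an explicit loop like the one filling squares is often replaced by a vectorized expression such as x^2; the loop form is shown here only to illustrate the syntax.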

🔹 Lesson 7: Custom Function Development and Scoping

Overview: This lesson explores the transition from using R as an interactive calculator to using it as a programming language by developing custom functions. It covers function definition syntax, argument handling, lexical scoping rules, and the foundational concepts of R’s S3 object-oriented system through generic functions and methods.

Learning Outcomes:

  • Create and Invoke Custom Functions: Define functions with formal parameters and custom binary operators.
  • Manage Arguments and Scoping: Distinguish between positional and keyword argument matching and explain how lexical scoping manages local and free variables.
  • Implement Mutable State and Custom Environments: Use closures and the super-assignment operator to maintain state and customize the R environment via startup/session functions.
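
The sketch below illustrates function definition, argument matching, a custom binary operator, and a closure that keeps state with the super-assignment operator. The names standardize, %+%, and make_account are hypothetical and exist only for this example.

```r
# A function with formal parameters; 'center' and 'scale' have defaults.
standardize <- function(x, center = mean(x), scale = sd(x)) {
  (x - center) / scale
}
z1 <- standardize(c(1, 2, 3))                 # positional matching
z2 <- standardize(scale = 1, x = c(1, 2, 3))  # keyword matching, any order

# A custom binary operator is a function whose name has the form %name%.
"%+%" <- function(a, b) paste(a, b)
greeting <- "hello" %+% "world"

# Lexical scoping: 'total' is a free variable of the inner functions and is
# looked up in the environment where they were created. The <<- operator
# assigns to it there, so the value persists between calls.
make_account <- function(total) {
  list(
    deposit = function(amount) {
      total <<- total + amount
      invisible(total)
    },
    balance = function() total
  )
}
acct <- make_account(100)
acct$deposit(50)
acct$balance()
```

Each call to make_account() creates a fresh environment, so separate accounts do not share their totals.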

🔹 Lesson 8: Statistical Modeling: Linear and Non-linear

Overview: This lesson explores the comprehensive suite of tools in R for statistical modeling beyond simple linear regression. It covers the extraction of model information through generic functions, the comparison of models via ANOVA, and the fitting of Generalized Linear Models (GLMs) for binary and count data, alongside nonlinear modeling techniques.

Learning Outcomes:

  • Use generic R functions to extract, summarize, and visualize information from fitted models.
  • Perform model comparisons using ANOVA tables and refit existing models concisely with update().
  • Fit Generalized Linear Models (GLMs) using appropriate families and link functions (e.g., Logit, Probit, Poisson).
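
A hedged sketch of this workflow is given below. It uses mtcars and warpbreaks, two data sets shipped with R's datasets package, purely as convenient stand-ins; the choice of variables is an assumption for illustration and not a recommended analysis.

```r
# Fit a simple linear model (mtcars is a built-in example data set).
fit1 <- lm(mpg ~ wt, data = mtcars)

# Generic functions extract information from a fitted model.
coef(fit1)               # estimated coefficients
summary(fit1)            # coefficient table and fit statistics
res <- residuals(fit1)   # residuals

# update() refits with a modified formula; '.' means "as before".
fit2 <- update(fit1, . ~ . + hp)

# anova() compares the two nested models.
cmp <- anova(fit1, fit2)

# GLMs: binomial family with a logit or probit link for binary data,
# Poisson family for counts.
logit_fit   <- glm(am ~ wt, family = binomial(link = "logit"),  data = mtcars)
probit_fit  <- glm(am ~ wt, family = binomial(link = "probit"), data = mtcars)
poisson_fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)
```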

🔹 Lesson 9: Visualizing Data with High and Low Level Graphics

Overview: This lesson covers the comprehensive graphical capabilities of R, distinguishing between high-level plotting functions that create complete charts and low-level commands that add specific elements to existing displays. Students will learn to manipulate graphical parameters for precise aesthetic control and manage multiple figure environments.

Learning Outcomes:

  • Distinguish between and implement high-level (e.g., plot(), hist()) and low-level (e.g., points(), lines()) graphical functions.
  • Apply and manage graphical parameters both permanently (with par()) and temporarily (as arguments to individual graphics functions).
  • Coordinate complex layouts, including multivariate data displays and multiple-figure environments.
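
The distinction between high-level and low-level functions, and between permanent and temporary parameter changes, can be sketched as follows. The data are simulated, and pdf(NULL) is used only so that the snippet also runs in a non-interactive session; in an interactive session that line and the final dev.off() can be dropped to see the plots on screen.

```r
pdf(NULL)                     # null device: nothing is written to disk
set.seed(42)                  # simulated data, for illustration only
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Permanent change: par() sets parameters for the device and returns
# the previous values so they can be restored later.
old <- par(mfrow = c(1, 2))   # two figures side by side

# High-level functions start a new plot; arguments such as pch and col
# are temporary settings that apply to this call only.
plot(x, y, pch = 19, col = "grey40", main = "Scatter plot")

# Low-level functions add to the current plot.
abline(lm(y ~ x), lwd = 2)
points(mean(x), mean(y), pch = 4, cex = 2)

# Second figure: a density-scaled histogram with a density curve added.
hist(x, freq = FALSE, main = "Histogram")
lines(density(x))

par(old)                      # restore the previous parameters
invisible(dev.off())          # close the device
```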

🔹 Lesson 10: Package Ecosystem and Environment Configuration

Overview: This lesson explores the structural foundations of R, focusing on the package ecosystem, the role of CRAN, and the mechanism of namespaces for managing functions. It also provides a practical roadmap for executing R through various interfaces and mastering environment configuration through command-line arguments and keyboard shortcuts.

Learning Outcomes:

  • Understand the relationship between packages, namespaces, and the CRAN repository system.
  • Execute a comprehensive "sample session" involving data manipulation, statistical modeling, and complex mathematical plotting.
  • Configure the R startup environment using command-line flags and environment variables.