Capstone in Applied Data Science with R
CSE-41410
Showcase your R expertise with a real-world, AI-powered analytics project
This R Analytics Capstone Project course integrates skills from the three required courses in the "R for Data Analytics" program, enabling students to create a professional portfolio project (e.g., a Quarto book or an R-Shiny interactive visualization) with AI assistance. Focusing on data collection, cleaning, analysis, and visualization using provided data sources, the course prepares professionals, researchers, and students for real-world analytics challenges.
Course Highlights:
- Introduction to Quarto Book Creation: Learning to structure and format a Quarto book for project documentation, including text, code, and output integration
- Data Collection from Multiple Sources: Selecting and importing data from provided datasets, preparing for analysis
- Data Cleaning and Preparation: Techniques to handle missing values, inconsistencies, and transformations in R
- Statistical Analysis Selection: Choosing and implementing a statistical method (e.g., regression, t-test) based on project needs
- Table and Figure Generation: Creating tables with "flextable" and visualizations to present results clearly
- Programming Skills: Writing custom functions and loops to automate and optimize data tasks
- Project Summary and Reporting: Crafting a comprehensive report in Quarto, integrating analysis, visuals, and conclusions
- AI-Assisted Project Development: Utilizing AI tools (e.g., for code suggestions, report polishing) to enhance project quality
Course Learning Outcomes:
- Synthesize skills from prior R courses to design and execute a comprehensive data analytics project, documented in a professional R project
- Apply data collection, cleaning, and statistical analysis techniques to real-world datasets, selecting from provided sources
- Create publication-ready tables and figures to communicate findings effectively
- Develop reusable R functions and loops to streamline data processing, enhancing project efficiency
- Produce a detailed summary report, leveraging AI tools to refine analysis and presentation for professional use
Course Typically Offered: Online in spring and fall quarters
Prerequisites: Students must complete all three required courses in the R for Data Analytics certificate before enrolling in this course: CSE-41097 Introduction to R Programming, CSE-41198 Introduction to Statistics using R, and CSE-41408 Advanced Data Wrangling and Visualization in R.
Next Step: After completing this course, consider taking CSE-41396 Practical R for the Pharmaceutical Industry to continue learning.
Contact: For more information about this course, please email unex-techdata@ucsd.edu.
Course Information
Course sessions
Class type:
This course is entirely web-based and to be completed asynchronously between the published course start and end dates. Synchronous attendance is NOT required.
You will have access to your online course on the published start date OR 1 business day after your enrollment is confirmed if you enroll on or after the published start date.
Textbooks:
All course materials are included unless otherwise stated.
Policies:
- No refunds after: 4/6/2026
Instructor: George Schoeffel
George Schoeffel is an accomplished instructor with extensive expertise in business intelligence and data analytics. He began his career researching how statistical agencies could generate synthetic data as a solution to decreasing survey response rates. Over the years, he has applied his skills in sectors including financial services and professional sports, and has served as an advisor to government agencies.
George completed his dissertation at Georgetown University’s McCourt School of Public Policy. His passion for artificial intelligence continues to drive his research and teaching. George is dedicated to helping students develop cutting-edge skills in data analytics and AI, preparing them for the challenges of the modern digital landscape.