Data preparation is an essential, yet often under-emphasized, step in the data mining process. People tend to focus on knowledge discovery, but without sufficient preparation of the data, the return on that effort will be limited. Data prepared without adequate skill and knowledge can lead to poor modeling results.
This class offers in-depth coverage of data preparation techniques and a step-by-step approach through a variety of tools while providing practical illustrations using real data sets. The hands-on exercises will anchor the learned concepts and offer valuable first-hand experience in cleaning, filtering, and preparing the data for mining and predictive or descriptive modeling. The goal is to transform the datasets so that their information content is best exposed to the mining tool.
Topics include:
- Prerequisites to good data preparation
- Dealing with variables
- Sparsity
- Monotonicity
- Increasing dimensionality
- Anachronisms
- Missing values
- Outliers
- Normalization, transformation, feature extraction, and feature reduction
- Building mineable datasets
- Data separation
- Dealing with imbalanced data
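As a rough illustration of three of the topics above (missing values, outliers, and normalization), here is a minimal Python sketch. It is not course material: the class itself uses WEKA, and the function names, the sample `ages` column, and the choices of median imputation, Tukey's fences, and min-max scaling are all assumptions made for this example.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values to Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column with one missing entry and one implausible value.
ages = [23, 31, None, 45, 38, 29, 250]
prepared = min_max_scale(clip_outliers(impute_median(ages)))
```

The order of the steps matters in this sketch: the missing value is filled before the quartiles are computed, and the extreme value is clipped before scaling so that it does not compress the remaining values toward zero.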
Practical experience:
- Hands-on data mining projects
Software: WEKA is used for class assignments. There is no additional cost for this product.
Course typically offered: Online in Winter and Summer
Prerequisites: Fundamentals of Data Mining or equivalent experience required.
Next Steps: Upon completion of this course, consider taking Data Mining: Advanced Concepts and Algorithms.
More Information: For more information about this course, please contact unex-techdata@ucsd.edu.
Course Number: CSE-41261
Credit: 2.00 unit(s)
Related Certificate Programs: Data Mining for Advanced Analytics
7/11/2023 - 9/9/2023
$625
Online
CLASS TYPE:
Online Asynchronous.
This course is entirely web-based and is to be completed asynchronously between the published course start and end dates. Synchronous attendance is NOT required.
You will have access to your online course on the published start date OR 1 business day after your enrollment is confirmed if you enroll on or after the published start date.
Sipes, Tamara, Data Mining Specialist
Tamara Sipes is a data mining specialist. She uses her data mining expertise to analyze data, select meaningful attributes and build predictive models that discover significant trends and relationships. Her work has led to patent awards for clients in biotechnology and other industries, and she has published research in the areas of data mining and learning technologies.
TEXTBOOKS:
No information available at this time.
POLICIES:
No refunds after: 7/17/2023.
7/11/2023 - 9/9/2023
extensioncanvas.ucsd.edu
You will have access to your course materials on the published start date OR 1 business day after your enrollment is confirmed if you enroll on or after the published start date.
There are no sections of this course currently scheduled. Please contact the Science & Technology department at 858-534-3229 or unex-sciencetech@ucsd.edu for information about when this course will be offered again.