MIS 661 Topic 1 DQ 1

School

Grand Canyon University

Course

661

Subject

Information Systems

Date

Feb 20, 2024

Uploaded by MasterTitanium11775

Data mining should be viewed as a process. As with all good statistical analyses, one needs to be clear about the purpose of the analysis. Just to "mine data" without a clear purpose, without an appreciation of the subject area, and without a modeling strategy will usually not be successful. Describe a data mining process that is popular in today's landscape that could be an alternative to the CRISP-DM Lifecycle.

CRISP-DM (Cross-Industry Standard Process for Data Mining) is widely adopted and has extensive documentation and support from the data mining community. It is generally seen as a practical and effective methodology for data mining projects. It covers the entire project lifecycle: understanding business goals, data collection and preparation, model building, evaluation, and deployment.

A popular alternative to the CRISP-DM Lifecycle in today's landscape is SEMMA, which stands for Sample, Explore, Modify, Model, Assess. SEMMA was developed by SAS (a software company) as a framework for its data mining software. It focuses primarily on the modeling phase and is more specific to SAS's software suite, although it has also been used more broadly in data analysis and modeling.

SEMMA and CRISP-DM are both process models used in data mining and machine learning to guide the steps involved in developing predictive models and extracting useful insights from data. They share some similarities but also differ in scope. SEMMA outlines five key phases: (1) sample, (2) explore, (3) modify, (4) model, and (5) assess. Together these give guidance on data sampling, exploration, modification, modeling, and model assessment. Because of this narrower scope, SEMMA is often used as a companion to other, more comprehensive methodologies like CRISP-DM.
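The five SEMMA phases can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how SAS implements SEMMA: the customer records are synthetic, the field names (`spend`, `churned`) are hypothetical, and the "model" is a simple one-variable cutoff chosen purely to keep the example short.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical synthetic data: churners tend to spend less per month.
def make_record():
    churned = random.random() < 0.3
    spend = random.gauss(40 if churned else 70, 10)
    return {"spend": spend, "churned": churned}

population = [make_record() for _ in range(10_000)]

# 1. Sample: draw a manageable subset and split it into training and holdout parts.
sample = random.sample(population, 1_000)
train, holdout = sample[:700], sample[700:]

# 2. Explore: summarize the variable we plan to model with.
train_spend = [r["spend"] for r in train]
mean_spend = statistics.mean(train_spend)
std_spend = statistics.stdev(train_spend)

# 3. Modify: standardize spend using the training statistics.
def modify(record):
    z = (record["spend"] - mean_spend) / std_spend
    return {"z": z, "churned": record["churned"]}

train_mod = [modify(r) for r in train]
holdout_mod = [modify(r) for r in holdout]

# 4. Model: predict churn when z falls below a cutoff; pick the cutoff
#    that gives the best accuracy on the training data.
def accuracy(rows, cutoff):
    hits = sum((r["z"] < cutoff) == r["churned"] for r in rows)
    return hits / len(rows)

candidates = [c / 10 for c in range(-30, 31)]
best_cutoff = max(candidates, key=lambda c: accuracy(train_mod, c))

# 5. Assess: measure accuracy on records the model has not seen.
holdout_accuracy = accuracy(holdout_mod, best_cutoff)
print(f"cutoff z < {best_cutoff:.1f}, holdout accuracy {holdout_accuracy:.2f}")
```

Note that nothing in the sketch addresses why churn matters to the business or how the cutoff would be put into production, which is consistent with SEMMA's focus on the modeling work itself.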
In summary, CRISP-DM is the more comprehensive and widely accepted process model, covering the entire project lifecycle. SEMMA is a more specialized framework that focuses primarily on the modeling phase and is closely associated with SAS software. The choice between the two depends on the specific needs and tools of a given project, with CRISP-DM being the more general and flexible approach. SEMMA, while useful for model-building within the SAS environment, may be less familiar and less widely adopted outside the SAS user base.

The SEMMA methodology can be applied to a wide range of business problems, including fraud identification, customer retention and turnover, database marketing, customer loyalty, bankruptcy forecasting, market segmentation, and risk, affinity, and portfolio analysis.
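The difference in scope can be made concrete by lining the two sets of phases up against each other. The mapping below is a rough, assumed correspondence for illustration only (neither methodology defines an official mapping), and the names `CRISP_DM` and `SEMMA_TO_CRISP_DM` are hypothetical.

```python
# The six CRISP-DM phases, in lifecycle order.
CRISP_DM = [
    "Business Understanding",
    "Data Understanding",
    "Data Preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
]

# Assumed, approximate counterpart of each SEMMA phase in CRISP-DM.
SEMMA_TO_CRISP_DM = {
    "Sample": "Data Understanding",
    "Explore": "Data Understanding",
    "Modify": "Data Preparation",
    "Model": "Modeling",
    "Assess": "Evaluation",
}

# CRISP-DM phases with no SEMMA counterpart under this mapping.
covered = set(SEMMA_TO_CRISP_DM.values())
uncovered = [phase for phase in CRISP_DM if phase not in covered]
print(uncovered)  # ['Business Understanding', 'Deployment']
```

Under this reading, the two phases SEMMA leaves out are the ones at either end of the project, which is why it is often paired with a broader methodology.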
Hotz, N. (2023). What is SEMMA? Data Science Process Alliance. https://www.datascience-pm.com/semma/
Starburst. (n.d.). SEMMA vs CRISP-DM. https://www.starburst.io/learn/data-fundamentals/semma-crisp-dm/