final 5080_2024

Q1. For the data 200, 400, 800, 1000, 2000:
1. Calculate the mean and variance.
2. Normalize the group by min-max normalization with new_min = 0 and new_max = 10.
3. In z-score normalization, what value should the first value, 200, be transformed to?

Answer:
1. Mean = (200 + 400 + 800 + 1000 + 2000) / 5 = 880. The squared deviations from the mean sum to 1,968,000, so the sample variance (n - 1 in the denominator) = 1,968,000 / 4 = 492,000.

2. Min-max normalization: v' = ((v - min) / (max - min)) x (new_max - new_min) + new_min, with min = 200, max = 2000, new_min = 0, new_max = 10. For example, for v = 400: ((400 - 200) / (2000 - 200)) x (10 - 0) + 0 = 1.11.

Value    Min-max normalized
200      0
400      1.11
800      3.33
1000     4.44
2000     10

3. Z-score normalization: z = (x - mean) / σ, where σ = √492,000 ≈ 701.43. For x = 200: z = (200 - 880) / √492,000 ≈ -0.969452112.

Q2. A database has four transactions. Let min_sup = 60% and min_conf = 80%.

(a) At the granularity of item category (e.g., item_i could be "Milk"), for the following rule template,

∀X ∈ transaction, buys(X, item1) ^ buys(X, item2) => buys(X, item3) [s, c]

list the frequent k-itemset for the largest k and all of the strong association rules (with their support s and confidence c) containing the frequent k-itemset for the largest k.

Answer: The largest k is 3, and the frequent 3-itemset is {Bread, Milk, Cheese}. The strong rules are:

Bread ^ Cheese => Milk [75%, 100%]
Cheese ^ Milk => Bread [75%, 100%]
Cheese => Milk ^ Bread [75%, 100%]

(b) At the granularity of brand-item category (e.g., item_i could be "Sunset-Milk"), for the following rule template,

∀X ∈ customer, buys(X, item1) ^ buys(X, item2) => buys(X, item3) [s, c]

list the frequent k-itemsets for the largest k.

Answer: The largest k is 3. The frequent 3-itemsets are {Wonder-Bread, Dairyland-Milk, Tasty-Pie} and {Wonder-Bread, Sunset-Milk, Dairyland-Cheese}.

Q3. Suppose you are asked to classify microarray data with 100 tissues and 10,000 genes. Which of the following algorithms would you recommend, and why?
1. Decision tree induction
2. Piece-wise linear regression
3. SVM
4. Associative classification
5. Genetic algorithm
6. Bayesian belief network

Answer: I recommend Support Vector Machines (SVM). With 10,000 genes but only 100 tissue samples, the feature space is far larger than the training set, a regime in which SVMs are known to generalize well. The soft-margin formulation penalizes misclassifications, kernel functions can capture non-linear relationships among genes, and the maximum-margin boundary gives a clear separation between the tissue classes.
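To double-check the arithmetic in Q1, here is a minimal Python sketch (not part of the original answer) that recomputes the mean, the sample variance, the min-max scaled values, and the z-score of 200:

import statistics

data = [200, 400, 800, 1000, 2000]

# Mean and sample variance (n - 1 denominator, matching 492,000 above).
mean = statistics.mean(data)          # 880
variance = statistics.variance(data)  # 492000

# Min-max normalization to the new range [0, 10].
lo, hi, new_lo, new_hi = min(data), max(data), 0, 10
min_max = [(x - lo) / (hi - lo) * (new_hi - new_lo) + new_lo for x in data]

# Z-score for the first value, 200.
z = (data[0] - mean) / variance ** 0.5  # about -0.9695

print(mean, variance)
print([round(v, 2) for v in min_max])   # [0.0, 1.11, 3.33, 4.44, 10.0]
print(round(z, 4))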
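The support and confidence figures in Q2 can be checked mechanically with a small calculator like the sketch below. The four transactions in it are hypothetical placeholders (the exam's actual transaction table is not included in this preview), chosen only so that Bread ^ Cheese => Milk evaluates to [75%, 100%] as stated above:

def support(itemset, transactions):
    # Fraction of transactions that contain every item in itemset.
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    # support(antecedent union consequent) / support(antecedent)
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))

# Hypothetical four-transaction database (NOT the exam's table, which is
# missing from this preview); chosen so the figures match the answer above.
transactions = [
    {"Bread", "Milk", "Cheese"},
    {"Bread", "Milk", "Cheese"},
    {"Bread", "Milk", "Cheese"},
    {"Eggs"},
]

print(support({"Bread", "Cheese", "Milk"}, transactions))       # 0.75
print(confidence({"Bread", "Cheese"}, {"Milk"}, transactions))  # 1.0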
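Finally, a minimal sketch of the Q3 recommendation using scikit-learn, assuming a linear-kernel SVM (the usual choice when features vastly outnumber samples). The 100 x 10,000 matrix and the binary labels are randomly generated stand-ins for real microarray data, so the workflow rather than the score is the point:

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10_000))  # 100 tissues x 10,000 gene expressions
y = rng.integers(0, 2, size=100)    # synthetic binary tissue labels

# C controls the misclassification penalty of the soft margin.
model = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())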
Q4. The table below shows the decision….. see text