Link Search Menu Expand Document

Project 03 - Sorting Algorithm Analysis

Due Date: Oct 19, 2020 at 6am

Submission Method:

  • Github Classroom Repo Automatically created when you click on this link

Introduction

In this project, you will create a number of datasets of integers of various sizes and run them against six different sorting algorithms which have been modified to count the number element comparisons performed. The project dictates 5 specific sorting algorithms (see below), and you will be free to choose the 6th one. The five algorithms you must implement are:

  • Bubble Sort
  • Selection Sort
  • Insertion Sort
  • Quick Sort
  • Merge Sort

The algorithm of your choosing must be a comparison-based sort. If you’re unsure if the algorithm you choose falls in this category, ask Dr. Fontenot.

You are free to use implementations of these algorithms that you find on the Internet or in text books. However, you will still need to modify them to keep a count of the number of comparisons that are performed during sorting. Note that this DOES NOT include comparisons that are part of loop control variables for counter controlled loops.

For each dataset you generate, you will execute it against each algorithm in its sorted, reverse sorted, and randomized ordering. So if you have 10 data sets, that means 30 different executions against each of the 6 algorithms.

Generating the Data Sets

The first step in the execution of your project will be to generate datasets. Your project will take one command line argument to indicate the mode for dataset generation:

  • -l will indicate that the project is being executed on your local development environment and can thus support much longer run times
  • -g will indicate that the project is being executed on Github actions and thus needs to only generate small data sets as proof of concept of execution

Local Execution -l

You should generate 10 data sets of increasing size. You’ll want to choose 10 sizes that will demonstrate the relative performance differences well between the different sorting algos. There is no time-limit imposed on this mode of execution except that it needs to be done before the project is due. There will be a practical limit on the data set size you can sort with the O(n^2) sorts. After that size, you don’t need to collect data on them anymore.

Github Actions Execution -g

Generate and run 3 different data sets of increasing size against each sorting algorithm. The max data set size should be around 500 or so. The goal here is to show proof of concept that the sorting algos work and can actually count the number of comparisons properly.

Output

Your project should generate a CSV file. Columns should represent the data sets/orderings, and rows should represent the sorting algorithms. The 2nd command line argument will be the name of the output file.

What You’ll Submit

Your deliverable will be a short paper explaining the results of your analysis. You can import the csv file into Excel or some other spreadsheet program to produce graphs. Your paper should not exceed 2 pages in length (single spaced, 12 pt Times or Arial font). No more than 25% of the vertical space can be taken up with graphs or other visuals. The paper should be submitted to Github in your project repo in PDF format.

Grading

Deliverable Points
Impl. of Sorting algos with counting 40
Proper data set generation 20
Analysis Paper 40

Table of contents