{ "cells": [ { "cell_type": "code", "execution_count": 13, "id": "lesbian-horror", "metadata": {}, "outputs": [], "source": [ "# All libraries needed for this project\n", "using LinearAlgebra\n", "import CSV\n", "import DataFrames\n", "import Flux" ] }, { "cell_type": "markdown", "id": "middle-tattoo", "metadata": {}, "source": [ "# Exercise 1: Regression & Regularization" ] }, { "cell_type": "code", "execution_count": 4, "id": "color-scale", "metadata": {}, "outputs": [], "source": [ "# Training data\n", "data_re = CSV.read(\"reg_dataset.csv\", DataFrames.DataFrame);" ] }, { "cell_type": "markdown", "id": "historical-gossip", "metadata": {}, "source": [ "Perform linear regression to predict exam results from the available features. Use the Flux library. (Use train_size=0.8)\n", "\n", "1. What is the mean absolute error (MAE)?\n", "2. What can you say about feature importance?\n", "3. Use regularization to remove the dependence on non-useful variables.\n", "4. What is the mean absolute error with regularization? Do you notice an improvement? Comment." ] }, { "cell_type": "markdown", "id": "accompanied-frame", "metadata": {}, "source": [ "# Exercise 2: Classification" ] }, { "cell_type": "code", "execution_count": 12, "id": "offensive-destruction", "metadata": {}, "outputs": [], "source": [ "# Training data\n", "data = CSV.read(\"class_dataset.csv\", DataFrames.DataFrame);" ] }, { "cell_type": "markdown", "id": "arctic-harvey", "metadata": {}, "source": [ "You are given several pelvic parameters used in the radiological assessment of the lumbosacral spine. One column of the dataframe ('class') indicates whether the pelvis is normal or abnormal.\n", "\n", "Using all the given features, perform binary classification with KNN and logistic regression to infer whether the pelvis is normal or not. 
(Use train_size=0.4)" ] }, { "cell_type": "markdown", "id": "perceived-guatemala", "metadata": {}, "source": [ "## Exercise 2.1: k-Nearest Neighbors (KNN) implementation" ] }, { "cell_type": "markdown", "id": "tropical-singer", "metadata": {}, "source": [ "1. Implement a working KNN algorithm yourself (do not use a KNN from a library).\n", "2. What accuracy can you get with this KNN model?\n", "3. What value would you choose for the hyperparameter n_neigh?\n", "4. What could you do to improve your KNN implementation? No code, just ideas." ] }, { "cell_type": "markdown", "id": "outdoor-mechanics", "metadata": {}, "source": [ "## Exercise 2.2: Logistic regression implementation" ] }, { "cell_type": "markdown", "id": "interested-flooring", "metadata": {}, "source": [ "1. Train a logistic regression model.\n", "2. What accuracy can you get with the logistic regression?\n", "3. How does this compare to the results obtained with the KNN?\n", "4. How could you minimize the risk that the algorithm classifies a pelvis as normal when it is not? Think about statistical inference and explain." ] }, { "cell_type": "code", "execution_count": null, "id": "clean-bleeding", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.4.1", "language": "julia", "name": "julia-1.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.4.1" } }, "nbformat": 4, "nbformat_minor": 5 }