Coding Environment

We will apply the concepts in this course to hands-on coding applications using real-world data. For this, you need Python 3.8+ installed on your computer (or on a computer that you have access to). I also recommend using Visual Studio Code (VS Code) as your development environment; it is what I will be using in class demonstrations and lecture materials.

After installing and opening VS Code, go to the File menu and select Open Folder... Open an empty folder that you created and whose location on your computer you know. Create a new file with the .py extension. If you do not have Python installed on your machine, VS Code will prompt you to install it.

For the next step, you will need bash on your machine. If you have a Windows machine, you will have to install bash; I recommend installing Git Bash.

The next step is to install scikit-learn, one of the packages that we will be working with in this course, along with pandas, which the test script in the next section uses to load the data. To do this, open a bash terminal in VS Code and type the following:

pip install -U scikit-learn pandas

Wait for the command to finish, then read the terminal output to confirm that the installation completed successfully.
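If the terminal output is hard to read, you can also check the result from Python itself. The sketch below is not part of the course materials: the helper name installed_version is hypothetical, and it relies only on the standard library's importlib.metadata (available since Python 3.8) to report the installed version of a package, or None when the package is missing.

```python
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed.

    Hypothetical helper for illustration; not part of the course materials.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The course asks for Python 3.8 or newer.
    print("Python version OK:", sys.version_info >= (3, 8))

    # Package names as they are written in a pip install command.
    # pandas is included because the test script imports it.
    for name in ["scikit-learn", "pandas"]:
        version = installed_version(name)
        print(name, "->", version if version else "not installed")
```

A line reading "not installed" means pip placed the package somewhere else, or the installation failed; the Troubleshooting section covers the first case.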

Testing your coding environment

Download the CSV file for the GPA study hours data and place it in a folder called data inside your project folder.
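The test script reads the file through the relative path data/gpa_study_hours.csv, which Python resolves against the folder you run the script from. A quick way to confirm the file is where the script will look for it, as a sketch using only the standard library (the helper name data_file_status is hypothetical):

```python
from pathlib import Path


def data_file_status(relative_path="data/gpa_study_hours.csv"):
    """Return (absolute_path, exists) for a path relative to the working directory.

    Hypothetical helper for illustration; not part of the course materials.
    """
    path = Path.cwd() / relative_path
    return path, path.is_file()


if __name__ == "__main__":
    path, exists = data_file_status()
    print("Looking for:", path)
    if exists:
        print("Found the data file.")
    else:
        print("Not found; check the folder name and where you run the script from.")
```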

Create a new Python file with the following:

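import pandas

from sklearn.linear_model import LinearRegression

if __name__ == "__main__":
    # Load the data; the path is relative to the folder you run the script from.
    data = pandas.read_csv("data/gpa_study_hours.csv")
    print(data.head())

    # X must be two-dimensional (a DataFrame); y is the target column.
    X = data[['study_hours']]
    y = data['gpa']

    # Fit an ordinary least squares linear regression.
    model = LinearRegression().fit(X, y)

    # Print the slope for study_hours and the intercept.
    print(model.coef_, model.intercept_)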

When you run this, it should output the following:

    gpa  study_hours
0  4.00         10.0
1  3.80         25.0
2  3.93         45.0
3  3.40         10.0
4  3.20          4.0
[0.00332834] 3.5279974190338232
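The two numbers on the last line are the fitted slope (model.coef_) and intercept (model.intercept_), so the model's prediction is gpa = intercept + slope * study_hours. As an illustration only, the sketch below plugs the values printed above into that formula. The function name predict_gpa is hypothetical; with a fitted model you would normally call model.predict instead.

```python
# Values copied from the expected output above.
SLOPE = 0.00332834
INTERCEPT = 3.5279974190338232


def predict_gpa(study_hours):
    """Evaluate the fitted line: intercept + slope * study_hours."""
    return INTERCEPT + SLOPE * study_hours


if __name__ == "__main__":
    for hours in (0, 10, 40):
        print(hours, "hours ->", round(predict_gpa(hours), 3))
```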

Troubleshooting

  • Run which python (bash) or where python (Windows) to see which Python installation your terminal associates with each python alias (for example, python and python3).
  • You can install a package into a specific Python installation by calling pip through that interpreter, for example: /usr/local/bin/python3 -m pip install statsmodels
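It can also help to ask Python itself which installation is running. The sketch below uses only the standard library: sys.executable is the path of the interpreter executing the script, which is the path you would use in place of /usr/local/bin/python3 in the command above. The helper name describe_interpreter is hypothetical.

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter running this script.

    Hypothetical helper for illustration; not part of the course materials.
    """
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("Interpreter:", info["executable"])
    print("Version:", info["version"])
    # To install a package for exactly this interpreter, run in a terminal:
    #   <interpreter path> -m pip install <package>
```

Run this file the same way you run the test script, so that it reports the interpreter VS Code is actually using.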