Even before we start analyzing data, we have to acquire data
There are a lot of data sets openly available on the internet, for example:
Even before we start analyzing data, we need to make sure our data is tidy
What to look for:
Rows and Columns
Day | High | Low | Wind | Forecast |
---|---|---|---|---|
Tuesday | 24 | 15 | 0 to 15 mph | Sunny |
Wednesday | 38 | 17 | 5 to 15 mph | Mostly Sunny |
Thursday | 34 | 13 | 5 to 15 mph | Mostly Sunny |
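
As a rough sketch of how a rows-and-columns table like this one can be held in code, each row can be stored as a dictionary keyed by column name. The variable names (`columns`, `rows`, `records`) are illustrative choices, not part of the course material.

```python
# Column names taken from the weather table
columns = ["Day", "High", "Low", "Wind", "Forecast"]

# Each inner list is one row of the table
rows = [
    ["Tuesday", 24, 15, "0 to 15 mph", "Sunny"],
    ["Wednesday", 38, 17, "5 to 15 mph", "Mostly Sunny"],
    ["Thursday", 34, 13, "5 to 15 mph", "Mostly Sunny"],
]

# Pair every value with its column name
records = [dict(zip(columns, row)) for row in rows]

# Read one column across all rows
highs = [record["High"] for record in records]

# Find one row by filtering on a column value
wednesday = next(r for r in records if r["Day"] == "Wednesday")

print(highs)
print(wednesday["Forecast"])
```
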
Tuesday:
  ↳ Temperature:
    ↳ Low: 15
    ↳ High: 24
  ↳ Wind:
    ↳ Speed: 0 to 15 mph
    ↳ Direction: West
Wednesday:
  ↳ Temperature:
    ↳ Low: 17
    ↳ High: 38
  ↳ Wind:
    ↳ Speed: 5 to 15 mph
    ↳ Direction: North West
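
Hierarchical data like this maps naturally onto nested dictionaries. The sketch below assumes JSON as the storage format, which is one common choice for nested data; the slides do not name a specific format here, and the variable names are illustrative.

```python
import json

# The nested weather example, written as dictionaries inside dictionaries
weather = {
    "Tuesday": {
        "Temperature": {"Low": 15, "High": 24},
        "Wind": {"Speed": "0 to 15 mph", "Direction": "West"},
    },
    "Wednesday": {
        "Temperature": {"Low": 17, "High": 38},
        "Wind": {"Speed": "5 to 15 mph", "Direction": "North West"},
    },
}

# Follow the path of keys from the top level down to a single value
tuesday_high = weather["Tuesday"]["Temperature"]["High"]

# Serialize to JSON text and read it back again
text = json.dumps(weather, indent=2)
restored = json.loads(text)

print(tuesday_high)
print(restored["Wednesday"]["Wind"]["Direction"])
```
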
One winter, I became very quiet
and saw my life. It was February
and outside in the city streets,
snow fell but would not collect.
I bought snapdragons and thistle,
got some discount peach roses
that smelled off. I split them
between vases and moved
the bouquets from room to room
while a violin solo rang out.
Answer the Gradescope questions on Data Formats
Remember that you can click on save for each answer to get feedback on whether you got the answer correct before clicking on submit
.xlsx files
.csv files, or comma-separated values
.tsv files, or tab-separated values
APIs often provide this type of data
Inspect this data set:
What about this data set?
Inspect this weather data
These data were retrieved using the weather.gov API
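
A hedged sketch of what working with such an API response might look like in Python. The field names (`properties`, `periods`, `name`, `temperature`, `windSpeed`, `windDirection`, `shortForecast`) reflect my understanding of the weather.gov forecast response and should be checked against the API documentation. The `sample` payload is hand-written for illustration, borrowing the Tuesday values from the earlier weather table, and `fetch_json` is defined but deliberately not called.

```python
import json
import urllib.request


def fetch_json(url):
    """Download a URL and decode its JSON body (not called in this sketch)."""
    # The User-Agent value is a placeholder; identify your own application here
    request = urllib.request.Request(url, headers={"User-Agent": "data-formats-example"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def periods_to_rows(forecast):
    """Flatten the nested forecast periods into one flat record per period."""
    rows = []
    for period in forecast["properties"]["periods"]:
        rows.append({
            "Day": period["name"],
            "Temperature": period["temperature"],
            "Wind": period["windSpeed"],
            "Direction": period["windDirection"],
            "Forecast": period["shortForecast"],
        })
    return rows


# Hand-written stand-in for an API response (assumed shape, illustrative values)
sample = {
    "properties": {
        "periods": [
            {"name": "Tuesday", "temperature": 24, "windSpeed": "0 to 15 mph",
             "windDirection": "W", "shortForecast": "Sunny"},
        ]
    }
}

rows = periods_to_rows(sample)
print(rows[0])
```

The flattening step turns hierarchical API output into the rows-and-columns shape shown earlier.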
Get data on mortality by country
The data you will be working with is often too large for Excel.
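
One way to handle a file that is too large for a spreadsheet is to read it one row at a time in code. The sketch below uses Python's built-in `csv` module. The small in-memory text stands in for a large file on disk, and its column names and values are borrowed from the earlier weather table for illustration.

```python
import csv
import io

# Stand-in for a large file; with a real file, use open(path, newline="") instead
csv_text = io.StringIO(
    "Day,High,Low\n"
    "Tuesday,24,15\n"
    "Wednesday,38,17\n"
    "Thursday,34,13\n"
)

count = 0
total_high = 0
max_high = None
max_high_day = None

# DictReader yields one row at a time, so the whole file never sits in memory
for row in csv.DictReader(csv_text):
    high = int(row["High"])  # CSV values arrive as strings
    count += 1
    total_high += high
    if max_high is None or high > max_high:
        max_high, max_high_day = high, row["Day"]

print(count, total_high / count, max_high_day)
```

For a tab-separated file, the same reader accepts `delimiter="\t"`.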
Even before we start analyzing data, we need to make sure our data is tidy
We can wrangle our raw data to make it tidy
Each column is a variable, each row is an observation
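
A minimal sketch of wrangling an untidy table into a tidy one, using plain Python. The untidy layout below is a made-up rearrangement of the earlier weather table in which the days are spread across the columns. The tidy result has one row per day and one column per variable.

```python
# Untidy layout: days are column headers, and each row is a measurement type
untidy = [
    {"Measure": "High", "Tuesday": 24, "Wednesday": 38, "Thursday": 34},
    {"Measure": "Low", "Tuesday": 15, "Wednesday": 17, "Thursday": 13},
]

# Every key except "Measure" is really a value of the Day variable
days = [key for key in untidy[0] if key != "Measure"]

# Tidy layout: one row per day (observation), one column per variable
tidy = []
for day in days:
    record = {"Day": day}
    for row in untidy:
        record[row["Measure"]] = row[day]
    tidy.append(record)

for record in tidy:
    print(record)
```
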
Take a look at the Apple Sales Dataset on Kaggle and answer the following questions: