Lecture 1
University of Arizona
INFO 511 - Fall 2024
If you have not yet completed the Getting to know you survey, please do so asap!
If you have not yet accepted the invite to join the course GitHub Organization, please do so asap!
Office hours linked at https://datasciaz.netlify.app/course-team.html
Let’s take a tour!
Only work that is clearly assigned as team work should be completed collaboratively.
Homework must be completed individually. You may not directly share answers or code with others; however, you are welcome to discuss the problems in general and ask for advice.
Exams must be completed individually. You may not discuss any aspect of the exam with peers. If you have questions, post them as private questions on the course forum; only the teaching team will see and answer them.
We are aware that a huge volume of code is available on the web, and many tasks may have solutions posted.
Unless explicitly stated otherwise, this course’s policy is that you may make use of any online resources (e.g. RStudio Community, StackOverflow, etc.) but you must explicitly cite where you obtained any code you directly use or use as inspiration in your solution(s).
Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism, regardless of source.
Treat generative AI, such as ChatGPT, as an online resource.
Guiding principles:
(1) Cognitive dimension: Working with AI should not reduce your ability to think clearly. We will practice using AI to facilitate—rather than hinder—learning.
(2) Ethical dimension: Students using AI should be transparent about their use and make sure it aligns with academic integrity.
✅ AI tools for code: You may make use of the technology for coding examples on assignments; if you do so, you must explicitly cite where you obtained the code.
❌ AI tools for narrative: Unless instructed otherwise, you may not use generative AI to write narrative on assignments. You may use generative AI as a resource as you complete assignments but not for answers.
To uphold the UArizona iSchool Community Standard:
Ask if you’re not sure if something violates a policy!
Complete all the preparation work before class.
Ask questions.
Do the readings.
Do the lab.
Don’t procrastinate – at least on a weekly basis!
Course operation
Doing data science
By the end of the course, you will be able to…
What does it mean for a data analysis to be “reproducible”?
Short-term goals:
Long-term goals:
Packages: Fundamental units of reproducible Python code, including reusable Python modules/functions, the documentation that describes how to use them, and sample data.
As of 23 July 2024, there are 557,005 Python packages (projects) available on PyPI (the Python Package Index).
We’re going to work with a small (but important) subset of these!
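Since reproducibility depends on knowing which package versions were used, here is a minimal sketch of how you might record the versions installed in your environment. The helper name `installed_versions` and the package list are illustrative assumptions, not part of the course materials; the lookup itself uses `importlib.metadata` from the Python standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


# Hypothetical package list; the packages used in this course may differ.
for name, ver in installed_versions(["pandas", "numpy", "seaborn"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Saving this output alongside an analysis is one simple way to let someone else recreate the same environment later.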
Option 1:
Sit back and enjoy the show!
Option 2:
Clone the corresponding application exercise repo and follow along.
ae-01-meet-the-penguins
Go to the course GitHub organization and clone ae-01-meet-the-penguins to your environment.