Python Basics for Data Science

Python is one of the most widely used languages for data science, thanks to its readable syntax and its large ecosystem of libraries.

Why Python for Data Science?

Easy to Learn: Clean, readable syntax
Rich Ecosystem: NumPy, pandas, Matplotlib, scikit-learn
Community Support: A large community with extensive tutorials, documentation, and Q&A resources
Industry Standard: Widely used in industry for data analysis

Essential Data Types

Python's built-in data types cover most everyday data work:

Lists: Ordered, mutable collections, e.g. [1, 2, 3]
Tuples: Ordered, immutable sequences, e.g. (1, 2, 3)
Dictionaries: Key-value pairs, e.g. {"name": "Alice"}
Sets: Unordered collections of unique elements, e.g. {1, 2, 3}
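
The differences between these four types are easiest to see side by side. The following is a minimal sketch; the variable names and sample values are hypothetical and chosen only for illustration.

```python
# Hypothetical sample values, chosen only for illustration.
readings = [3, 1, 2]             # list: ordered and mutable
readings.append(3)               # lists can grow in place

point = (1.5, 2.5)               # tuple: fixed once created
x, y = point                     # tuples unpack into separate names

person = {"name": "Alice"}       # dictionary: look values up by key
person["age"] = 20               # add a new key-value pair

unique_readings = set(readings)  # set: the duplicate 3 is dropped

print(readings)
print(x + y)
print(person)
print(sorted(unique_readings))   # sorted() returns a list in ascending order
```

A tuple has no append method, which is why tuples suit records that should not change after creation.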

Working with Numbers

Python works with integers and floating-point numbers directly, with no type declarations:

age = 25  # Integer
temperature = 98.6  # Float
result = age * 2  # Arithmetic operations
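
A few details of numeric behavior are worth seeing in code: mixing the two types, the two division operators, and the fact that floats are approximations. This is a small sketch; the names other than age and temperature are hypothetical.

```python
import math

age = 25
temperature = 98.6

mixed = age + temperature   # int + float produces a float
true_division = 7 / 2       # / always returns a float
floor_division = 7 // 2     # // rounds down to a whole number
remainder = 7 % 2           # % gives the remainder

print(type(mixed).__name__)
print(true_division, floor_division, remainder)

# Floats are binary approximations, so compare them with a tolerance
print(0.1 + 0.2 == 0.3)
print(math.isclose(0.1 + 0.2, 0.3))
```

For this reason, exact equality checks on computed floats are usually replaced with math.isclose or a rounding step.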

Control Flow

Make decisions and repeat operations:

if/elif/else: Conditional logic
for loops: Iterate over a sequence
while loops: Repeat while a condition is true
List comprehensions: Build a list in a single expression
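
The four constructs above can be sketched on one small list. The temperature values are the same sample readings used in this lesson's code example; the 70 and 73 degree thresholds and the label names are assumptions made up for illustration.

```python
temperatures = [72, 75, 68, 71, 73, 69, 70]

# for loop with if/elif/else: label each reading
# (thresholds are arbitrary, for illustration only)
labels = []
for temp in temperatures:
    if temp >= 73:
        labels.append("warm")
    elif temp >= 70:
        labels.append("mild")
    else:
        labels.append("cool")

# while loop: count readings from the start until one drops below 70
index = 0
while index < len(temperatures) and temperatures[index] >= 70:
    index += 1

# list comprehension: keep only the warm readings
warm_days = [temp for temp in temperatures if temp >= 73]

print(labels)
print(index)
print(warm_days)
```

The comprehension does the same job as a for loop with an if and an append, in a single line.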

Functions

Functions organize reusable code:

def calculate_average(numbers):
    return sum(numbers) / len(numbers)

Use the def keyword to define a function
Parameters pass data into the function
The return keyword sends a value back to the caller
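
One thing the short version of calculate_average does not handle is an empty list, where len(numbers) is 0 and the division raises ZeroDivisionError. A possible variant is sketched below; the default parameter and the choice to return 0.0 are assumptions for illustration, not the only reasonable design.

```python
def calculate_average(numbers, default=0.0):
    """Return the mean of numbers, or default when numbers is empty."""
    if not numbers:  # an empty list would cause a ZeroDivisionError below
        return default
    return sum(numbers) / len(numbers)


grades = [85, 92, 88, 95]
print(calculate_average(grades))
print(calculate_average([]))
```

Raising an error for empty input is an equally valid choice when a silent default could hide a data problem.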

Working with Libraries

Import powerful libraries to extend Python:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
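
NumPy, pandas, and Matplotlib are third-party packages, so they have to be installed (commonly with pip) before these imports succeed. The import forms themselves can be tried with modules from the standard library, which ship with Python. A small sketch, using statistics and math as stand-ins:

```python
import statistics as st   # alias form, the same pattern as "import numpy as np"
from math import sqrt     # import a single name from a module

temperatures = [72, 75, 68, 71, 73, 69, 70]

print(st.mean(temperatures))
print(st.median(temperatures))
print(sqrt(16))
```

The alias only changes the name used in your own code; the module that gets loaded is the same.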

Best Practices

Use descriptive variable names
Write clear, readable code
Comment complex logic
Follow PEP 8 style guidelines
Use virtual environments for projects
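
As a rough illustration of the first few points, here is the same logic written twice. Both function names are hypothetical.

```python
# Hard to follow: single-letter names and no explanation
def f(d):
    return [x for x in d if x > 0]


# Clearer: descriptive names, a docstring, and PEP 8 spacing
def keep_positive_values(values):
    """Return only the values greater than zero."""
    return [value for value in values if value > 0]


print(keep_positive_values([-1, 2, 0, 3]))
```

Both functions behave identically; only the second tells a reader what it is for.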

Code Example

# Python basics for data analysis

# Lists and basic operations
temperatures = [72, 75, 68, 71, 73, 69, 70]
print(f"Average temperature: {sum(temperatures) / len(temperatures):.1f}°F")
print(f"Highest: {max(temperatures)}°F")
print(f"Lowest: {min(temperatures)}°F")

# List comprehension
celsius = [(temp - 32) * 5/9 for temp in temperatures]
print(f"Celsius: {[round(t, 1) for t in celsius]}")

# Dictionaries for structured data
student = {
    "name": "Alice",
    "age": 20,
    "grades": [85, 92, 88, 95],
    "major": "Data Science"
}

print(f"{student['name']}'s average: {sum(student['grades']) / len(student['grades'])}")

# Functions for reusable logic
def calculate_statistics(data):
    return {
        "mean": sum(data) / len(data),
        "max": max(data),
        "min": min(data),
        "count": len(data)
    }

stats = calculate_statistics(temperatures)
print(f"Statistics: {stats}")

# Working with files
# Assumes a data.csv file with a header row exists in the working directory
import csv

with open('data.csv', 'r', newline='') as file:
    # DictReader maps each row to the header names and handles quoted commas
    for row in csv.DictReader(file):
        print(row)
