Introduction to Git and GitHub
Learn version control with Git and collaboration with GitHub, two essential skills for modern software development and data science.
What is Git?
Git is a distributed version control system that tracks changes in your code over time. It allows you to:
- Save snapshots of your project at different points in time
- Collaborate with others without overwriting each other's work
- Experiment with new features without breaking working code
- Revert to previous versions when things go wrong
- Track who made what changes and when
What is GitHub?
GitHub is a cloud-based hosting service for Git repositories. It provides:
- Remote storage for your Git repositories
- Collaboration tools (pull requests, issues, discussions)
- Code review features
- Project management tools
- CI/CD integration
- Portfolio for showcasing your work
Why Learn Git?
For Data Scientists
- Track changes to data analysis notebooks
- Collaborate on machine learning projects
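- Version model configurations and small datasets (large files need Git LFS or a dedicated data versioning tool)
- Share reproducible research
- Keep a history of experiments alongside the code that produced them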
For Developers
- Industry standard for version control
- Expected in most software development roles
- Essential for open-source contribution
- Enables team collaboration
- Facilitates code review processes
Key Concepts
Repository (Repo)
A folder that Git tracks. Contains all your project files and the complete history of changes.
Commit
A snapshot of your project at a specific point in time, similar to a save point in a game.
Branch
An independent line of development. Allows you to work on features without affecting the main code.
Remote
A version of your repository hosted on the internet (usually on GitHub).
Clone
Creating a local copy of a remote repository on your computer.
Push
Uploading your local commits to a remote repository.
Pull
Downloading changes from a remote repository and merging them into your current local branch.
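You can try clone, push, and pull without a GitHub account. The sketch below is only an illustration under assumptions: a bare repository in a temporary directory stands in for the remote, and the names alice, bob, notes.txt, and the identity values are hypothetical placeholders.

```shell
# Stand-in "remote": a bare repository in a temporary directory.
# (Assumption: with GitHub this would be a repository URL, not a local path.)
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"

# Clone: create a full local copy. Git warns that the repository is empty,
# which is expected at this point.
git clone --quiet "$work/remote.git" "$work/alice"
cd "$work/alice"
git config user.name "Alice Example"       # placeholder identity, this repo only
git config user.email "alice@example.com"

# Commit locally, then push the commit to the remote.
echo "shared notes" > notes.txt
git add notes.txt
git commit --quiet -m "Add shared notes"
git push --quiet origin HEAD

# A second clone receives everything pushed so far.
git clone --quiet "$work/remote.git" "$work/bob"

# The first clone pushes another commit...
echo "more notes" >> notes.txt
git commit --quiet -am "Extend notes"
git push --quiet origin HEAD

# ...and the second clone pulls it.
cd "$work/bob"
git pull --quiet --ff-only
cat notes.txt
```

Both clones end up on the same commit. The `--ff-only` option makes the pull stop instead of creating a merge if the two histories have diverged.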
Git vs GitHub
| Git | GitHub |
|---|---|
| Version control system | Hosting service for Git repositories |
| Installed on your computer | Cloud-based platform |
| Command-line tool | Web interface + additional features |
| Works offline | Requires internet connection |
| Free and open-source | Free for public repos, paid for advanced features |
Installing Git
Windows
- Download from git-scm.com
- Run the installer
- Use default settings (or customize as needed)
- Verify: Open Git Bash and run
git --version
macOS
Verify installation:
git --version
Linux (Ubuntu/Debian)
Verify installation:
git --version
Initial Git Configuration
After installing Git, configure your identity:
# Set your name
git config --global user.name "Your Name"
# Set your email
git config --global user.email "your.email@example.com"
# Check your configuration
git config --list
Use Your GitHub Email
If you plan to use GitHub, use the same email address as your GitHub account so that GitHub can link your commits to your profile.
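The --global flag applies the identity to every repository on your machine. As a sketch of an alternative, running the same commands without --global inside a repository sets the values for that repository only, and they take precedence there. The temporary repository and the identity values below are placeholders.

```shell
# Throwaway repository for the demonstration.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Without --global, the setting is written to this repository's .git/config.
git config user.name "Project Name"            # placeholder value
git config user.email "project@example.com"    # placeholder value

# Read single values back; the repository-level value wins over the global one.
git config --get user.name
git config --get user.email

# List only the settings stored in this repository.
git config --local --list
```

This is useful when one project needs a different email address (for example, a work address) than the rest of your repositories.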
Creating a GitHub Account
- Go to github.com
- Click "Sign up"
- Follow the registration process
- Verify your email address
- Complete your profile
Basic Workflow Overview
Here's a typical Git workflow:
1. Initialize or clone a repository
↓
2. Make changes to files
↓
3. Stage changes (git add)
↓
4. Commit changes (git commit)
↓
5. Push to remote (git push)
↓
6. Repeat steps 2-5
Your First Git Repository
Let's create your first Git repository:
# Create a new folder
mkdir my-first-repo
cd my-first-repo
# Initialize Git
git init
# Check status
git status
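You should see output similar to this (the branch name may be master or main, depending on your Git version and configuration):
On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)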
Creating Your First Commit
# Create a file
echo "# My First Repo" > README.md
# Check status
git status
# Stage the file
git add README.md
# Commit
git commit -m "Initial commit: Add README"
# View commit history
git log
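As a follow-up sketch, a few read-only commands show what that commit recorded. The example rebuilds the same repository in a temporary directory so it can run on its own; the identity values are placeholders set for that repository only.

```shell
# Rebuild the repository from the steps above in a temporary directory.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "# My First Repo" > README.md
git add README.md
git commit --quiet -m "Initial commit: Add README"

# Inspect the result without changing anything.
git log --oneline        # one line per commit: short hash and message subject
git show --stat HEAD     # author, date, message, and the files the commit changed
git ls-files             # files Git is currently tracking
```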
Common Git Terms Explained
Working Directory
The folder on your computer where you're currently working. Contains your actual files.
Staging Area (Index)
An intermediate area where you prepare files before committing. Think of it as a "preview" of your next commit.
Local Repository
The .git folder in your project that stores all commits and history on your computer.
Remote Repository
A version of your project hosted on a server (like GitHub).
HEAD
A pointer to your current branch and commit. Think of it as "you are here."
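The terms above can be observed directly. The following sketch, run in a throwaway repository with a hypothetical notes.txt file and placeholder identity values, uses the short status format to show where a change currently lives: the first status column describes the staging area and the second describes the working directory.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "first line" > notes.txt
git status --short       # "?? notes.txt": exists only in the working directory

git add notes.txt
git status --short       # "A  notes.txt": added to the staging area

echo "second line" >> notes.txt
git status --short       # "AM notes.txt": staged, plus a newer unstaged edit

git commit --quiet -m "Add notes"
git status --short       # " M notes.txt": the unstaged edit was not committed

# The local repository stored the staged version only.
git show HEAD:notes.txt  # prints "first line"

# HEAD names the current branch, which points to the latest commit.
git symbolic-ref --short HEAD
git rev-parse HEAD
```

The commit contains only what was staged at the time, which is why the second line is still reported as a modification afterwards.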
Git Workflow Diagram
Working Directory → (git add) → Staging Area → (git commit) → Local Repository → (git push) → Remote Repository
Working Directory ← (git restore) ← Staging Area ← (git reset) ← Local Repository ← (git pull) ← Remote Repository
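The backward arrows are easier to follow with a concrete change. This is a minimal sketch in a throwaway repository (notes.txt and the identity values are placeholders) that stages an edit and then undoes it one step at a time. It assumes Git 2.23 or newer, where git restore is available.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "stable" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# Forward: working directory to staging area.
echo "experiment" >> notes.txt
git add notes.txt
git status --short               # "M  notes.txt": the edit is staged

# Backward: the staging area goes back to the last commit.
# The edit itself stays in the file on disk.
git reset --quiet -- notes.txt
git status --short               # " M notes.txt": the edit is unstaged

# Backward: the working directory file is restored from the staging area.
# This discards the edit, so use it with care.
git restore notes.txt
git status --short               # prints nothing: everything matches the commit
cat notes.txt                    # prints "stable"
```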
Best Practices
- Commit Often - Make small, focused commits
- Write Clear Messages - Explain what and why, not how
- Pull Before Push - Always sync before sharing your changes
- Use Branches - Don't work directly on main/master
- Review Changes - Check git status and git diff before committing
- Use .gitignore - Don't commit sensitive data or generated files
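Two of these practices, using branches and reviewing changes, fit into one short sketch. The branch name add-usage, the file contents, and the identity values are hypothetical, and git switch assumes Git 2.23 or newer.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "# Demo" > README.md
git add README.md
git commit --quiet -m "Add README"

# Work on a separate branch instead of the default one.
git switch -c add-usage                    # hypothetical branch name

echo "Usage: run the script" >> README.md

# Review before staging: working directory compared with the staging area.
git diff

git add README.md

# Review before committing: staging area compared with the last commit.
git diff --staged

git commit --quiet -m "Document usage"
git log --oneline
```

Plain git diff shows nothing once a change is staged, so git diff --staged is the one to check right before committing.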
What Not to Commit
Never commit:
- Passwords or API keys
- Large binary files (use Git LFS instead)
- Dependencies (node_modules, venv, etc.)
- Personal configuration files
- Temporary files
- Compiled code (unless necessary)
Create a .gitignore file to exclude these automatically.
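As a rough sketch, a .gitignore for the categories above might look like the following. The patterns and file names are examples chosen for illustration, not a complete list for any particular project.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Example patterns only; adjust them to your project.
cat > .gitignore <<'EOF'
# Secrets
.env
# Dependencies
node_modules/
venv/
# Temporary and compiled files
*.log
*.pyc
__pycache__/
EOF

# Create some files that would otherwise show up as untracked.
touch .env debug.log app.py
mkdir -p node_modules
touch node_modules/pkg.js

# Only .gitignore and app.py are listed; the rest is ignored.
git status --short

# Show which pattern matches each ignored path.
git check-ignore -v .env debug.log node_modules/pkg.js
```

Note that .gitignore only affects untracked files. A file that has already been committed stays tracked until you remove it from the repository.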
Getting Help
# General help
git help
# Help for a specific command
git help commit
git commit --help
# Quick reference
git <command> -h
Resources
- Official Git Documentation
- GitHub Guides
- Pro Git Book (Free online)
- Git Cheat Sheet
What's Next?
In the following tutorials, you'll learn:
- Basic Git Commands - add, commit, status, log
- Branching and Merging - Create and manage branches
- GitHub Basics - Push, pull, clone, fork
- Collaboration - Pull requests, code review
- Advanced Topics - Rebase, cherry-pick, stash
You're Ready!
You now understand the basics of Git and GitHub. Let's dive deeper into the commands and workflows.
Next Tutorial: Basic Git Commands