gaurgv/Pandas-Cookbook-Third-Edition

Pandas Cookbook, Third Edition

by William Ayd and Matthew Harrison

Practical recipes for scientific computing, time series, and exploratory data analysis using Python

This is the code repository for Pandas Cookbook, Third Edition, published by Packt.

About the book 📔

The pandas library is massive, and it's common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands as one would do during an actual analysis. This book guides you, as if you were looking over the shoulder of an expert, through situations that you are highly likely to encounter.

With this latest edition, unlock the full potential of pandas 2.x and later. Whether you're a beginner or an experienced data analyst, this book offers practical recipes to help you excel in your data analysis projects. This cookbook covers everything from fundamental data manipulation tasks to advanced techniques for handling big data, visualization, and more. Each recipe addresses a common real-world challenge, with clear explanations and step-by-step instructions to guide you through the process.

Explore topics such as idiomatic pandas coding, efficient handling of large datasets, and advanced data visualization techniques. Whether you're looking to sharpen or expand your skills, the Pandas Cookbook is your companion for mastering data analysis and manipulation with pandas 2.x and beyond.

What you will learn 📖

  • The pandas type system and how to best navigate it
  • Import/export DataFrames to/from common data formats
  • Data exploration in pandas through dozens of practice problems
  • Grouping, aggregation, transformation, reshaping, and filtering data
  • Merge data from different sources through pandas SQL-like operations
  • Leverage the robust pandas time series functionality in advanced analyses
  • Scale pandas operations to get the most out of your system
  • The large ecosystem that pandas can coordinate with and supplement

Table of Contents 📑

  1. pandas Foundations
  2. Selection and Assignment
  3. Data Types
  4. The pandas I/O System
  5. Algorithms and How to Apply Them
  6. Visualization
  7. Reshaping DataFrames
  8. Group By
  9. Temporal Data Types and Algorithms
  10. General Usage/Performance Tips
  11. The pandas Ecosystem

Getting started 🚀

The code in this book will make use of the pandas, NumPy, and PyArrow libraries. Jupyter Notebook files are also a popular way to visualize and inspect code. All of these libraries should be installable via pip or the package manager of your choice. For pip users, you can run:

python -m pip install pandas numpy pyarrow notebook
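
After installing, it can be useful to confirm that the packages are visible to your Python interpreter. The following is a minimal sketch and not part of the book's code: it uses the standard library's `importlib.metadata` to look up the installed version of each distribution named in the pip command above. The `installed_versions` helper and the `REQUIRED` tuple are hypothetical names introduced only for this illustration.

```python
from importlib import metadata

# Distribution names taken from the pip command above.
REQUIRED = ("pandas", "numpy", "pyarrow", "notebook")


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` can be installed by rerunning the pip command above in the same environment.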

Running a Jupyter notebook 💻

The suggested way to work through this book is to have a Jupyter notebook up and running so that you can run the code while reading the recipes. Following along on your computer lets you explore on your own and gain a deeper understanding than reading the book alone.

After installing Jupyter notebook, open a Command Prompt (type cmd at the search bar on Windows, or open Terminal on Mac or Linux) and type:

jupyter notebook

Raise an issue 🚩

If you see anything that doesn't run as expected, raise an issue, and we'll work on it!

If you encounter any problems in the notebooks, you can create an issue, and we will be glad to provide support.

Get my copy 📦

If you feel this book is for you, get your copy today!

Know more on the Discord server

Join our community's Discord space to ask questions, offer solutions to other readers, discuss the book with the authors, and much more.

Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost. Simply click here to claim your Free PDF.

We also provide a PDF file that has color images of the screenshots/diagrams used in this book at ColorImages.

Get to Know the Authors

William Ayd is a core maintainer of the pandas project, serving in that role since 2018. For over a decade working as a consultant, Will has helped countless clients get the most value from their data using pandas and the open-source ecosystem surrounding it.

Matthew Harrison has been using Python since 2000. He runs MetaSnake, which provides corporate training for Python and data science. He is the author of Machine Learning Pocket Reference, the bestselling Illustrated Guide to Python 3, and Learning the Pandas Library, among other books.
