Data Analysis with Python and PySpark

Posted By: arundhati

Jonathan Rioux, "Data Analysis with Python and PySpark"
English | ISBN: 1617297208 | 2022 | 456 pages | AZW | 7 MB

Think big about your data! PySpark brings the powerful Spark big data processing engine to the Python ecosystem, letting you seamlessly scale up your data tasks and create lightning-fast pipelines.

In Data Analysis with Python and PySpark you will learn how to:

Manage your data as it scales across multiple machines
Scale up your data programs with full confidence
Read and write data to and from a variety of sources and formats
Deal with messy data with PySpark’s data manipulation functionality
Discover new data sets and perform exploratory data analysis
Build automated data pipelines that transform, summarize, and get insights from data
Troubleshoot common PySpark errors
Create reliable long-running jobs
