Python Tools for Data Analysis and Visualization Training
About the Program
Through a series of hands-on exercises, students will learn to turn data into actionable information. The world is drowning in data. Each day, 2.5 exabytes of data are produced (the equivalent of 250 new Libraries of Congress or 90 years of HD video). The challenge is getting that data into a format that analysis tools can use to understand and verify it. Python is relatively quick to learn and has a rich set of tools for importing, transforming, and exploring data, extracting insights from it, making predictions with it, and exporting it. This course introduces the major Python tools for preparing data for analysis, for understanding the data, and for using it to produce insights and predictions. All class work and exercises are done in Python 3.x.
Program Objective
- Learn how to use Jupyter notebooks
- Learn how to work with NumPy datatypes
- Be proficient in pandas Series
- Be proficient in pandas DataFrames
- Understand how to use data visualization
- Know how to import and clean data
- Get an introduction to statistical tools for working with data sets
- Understand the problems of working with PDF data sources
- Get an introduction to machine learning tools for working with data sets
Target Audience
- Data Analysts.
- Data Scientists.
- Business Analysts.
- Financial Analysts.
- Researchers.
- Software Developers.
- Data Engineers.
- Statisticians.
Training Period
- Classroom: 5 Days
- Online: 7 Days
Module 1: Advanced Python Review.
- A Python Development Environment.
- A Review of Data Types.
- The New Class Structure.
- Python Best Practices.
Module 2: IPython Notebook.
- Functionality Provided – Why Use Them?
- CRUD for Notebooks.
- Interface and Shortcuts.
Module 3: NumPy.
- Datatypes.
- Universal Functions.
- Indexing.
- Summary Methods.
- Sorting.
- Computations and Broadcasting.
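The NumPy topics above can be previewed in a few lines. This is a minimal sketch rather than course material: the `scores` array and its values are made up for illustration, and only standard NumPy calls are used.

```python
import numpy as np

# Hypothetical data: 3 students (rows) x 4 quizzes (columns).
scores = np.array([[80, 90, 70, 60],
                   [55, 65, 75, 85],
                   [100, 90, 95, 85]], dtype=np.float64)

# Summary method along an axis: one mean per quiz (column).
quiz_means = scores.mean(axis=0)      # shape (4,)

# Broadcasting: the (4,) vector is applied to each of the 3 rows.
centered = scores - quiz_means        # shape (3, 4)

# Universal function: applied element by element.
abs_dev = np.abs(centered)

# Boolean indexing: keep only scores of 90 or more (row-major order).
high = scores[scores >= 90]

# Sorting: sort the scores within each row.
sorted_rows = np.sort(scores, axis=1)
```

Subtracting `quiz_means` works without an explicit loop because NumPy lines up the trailing dimension of both arrays and repeats the smaller one along the rows.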
Module 4: SciPy.
- Overview of SciPy.
- Statistical Functions.
- Pandas: Series.
- Pandas Series Structure.
- Series CRUD.
- Indexing and Access Techniques.
- Data Methods.
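As a rough illustration of the Series topics above (structure, CRUD, indexing, and data methods), the sketch below uses a hypothetical `temps` Series with invented values. It is one possible way to show the operations, not the course's own exercise.

```python
import pandas as pd

# Hypothetical daily temperatures, labelled by weekday.
temps = pd.Series([21.5, 23.0, 19.5],
                  index=["mon", "tue", "wed"], name="temp_c")

# Create: assigning to a new label adds an entry.
temps["thu"] = 22.0

# Read: label-based (.loc) and position-based (.iloc) access.
tue = temps.loc["tue"]
first = temps.iloc[0]

# Update: overwrite the value stored under an existing label.
temps.loc["wed"] = 20.0

# Delete: drop returns a new Series without the given label.
temps = temps.drop("mon")

# Boolean indexing and a data method.
warm = temps[temps > 21.0]
mean_temp = temps.mean()
```

Here `.loc` selects by label and `.iloc` by position, which is the distinction that most often trips up newcomers to pandas indexing.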
Module 5: Pandas: DataFrame Basics.
- DataFrame Construction.
- DataFrame Change and Reorganization.
- Indexing and Access Techniques.
- Grouping, Pivoting, and Reshaping.
- DataFrame CRUD.
- Pandas DataFrame: Data Manipulation.
- Statistics.
- Data Methods.
- Missing Data Tools.
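The DataFrame topics above (construction, missing data tools, grouping, and pivoting) can be sketched as follows. The `sales` table, its column names, and its values are assumptions made purely for illustration, and filling a gap with the column mean is only one of several reasonable strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical sales table with one missing revenue value.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "quarter": ["q1", "q2", "q1", "q2"],
    "revenue": [100.0, np.nan, 80.0, 120.0],
})

# Missing data tools: count the gaps, then fill them with the column mean.
missing_count = int(sales["revenue"].isna().sum())
filled = sales.assign(
    revenue=sales["revenue"].fillna(sales["revenue"].mean())
)

# Grouping: total revenue per region.
totals = filled.groupby("region")["revenue"].sum()

# Pivoting: reshape to one row per region and one column per quarter.
wide = filled.pivot(index="region", columns="quarter", values="revenue")
```

Using `assign` leaves the original `sales` frame untouched, so the raw data with its missing value stays available for comparison.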
Module 6: Understanding Data Visualization.
- Visualization Is Storytelling.
- Types of Charts.
- Colors Yes and No.
- Common Mistakes.
- Best Practices.
- Reproducibility.
- Matplotlib for Data Visualization.
- Steps for Creating a Data Visualization.
- Jupyter Notebooks and Matplotlib.
- Matplotlib Styles.
- Small Multiples.
- Pandas Series Plotting.
- Pandas DataFrame Plotting.
- Advanced Techniques.
- Seaborn.
- Bokeh.
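To give a sense of the Matplotlib portion of this module, here is a minimal small-multiples sketch. The curves plotted are arbitrary examples, and the `Agg` backend is chosen only so the script runs without a display. In a Jupyter notebook the backend line can be dropped and the figure appears inline.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Arbitrary example data: three curves over the same x range.
x = np.linspace(0, 2 * np.pi, 50)
curves = {"sin": np.sin(x), "cos": np.cos(x), "sin(2x)": np.sin(2 * x)}

# Small multiples: one panel per curve, sharing the y axis for comparison.
fig, axes = plt.subplots(1, len(curves), figsize=(9, 3), sharey=True)
for ax, (label, y) in zip(axes, curves.items()):
    ax.plot(x, y)
    ax.set_title(label)
    ax.set_xlabel("x")
axes[0].set_ylabel("value")
fig.tight_layout()

# Render to an in-memory PNG instead of writing a file to disk.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

Sharing the y axis keeps the panels directly comparable, which is the main point of a small-multiples layout.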
Delivery Method
This program is taught through a mix of practical activities, theory, group work and case studies. Training manuals and additional reference materials are provided to the participants.
Certification
Upon successful completion of the training, participants will be awarded a certificate of course completion.