In this project, I conducted a comprehensive data analysis of Auralin, an oral insulin being researched as a substitute for Novodra, a popular injectable insulin. The analysis aimed to determine whether Auralin is as effective as Novodra in managing diabetes.
Data cleaning: removing duplicates, handling missing values, and correcting errors so the data is ready for analysis.
Feature engineering: creating meaningful features from the raw data to improve the analysis and the insights drawn from it.
Exploratory data analysis: understanding the underlying patterns and distributions in the data to guide further analysis.
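A minimal sketch of the cleaning and feature-creation work described above, assuming pandas. The table, the column names (`patient_id`, `hba1c_start`, `hba1c_end`) and the values are made up for illustration and are not taken from the project data. Dropping incomplete rows is only one possible way to handle missing values.

```python
import pandas as pd

# Hypothetical toy data: names and values are illustrative only.
raw = pd.DataFrame({
    "patient_id": [1, 2, 2, 3],
    "hba1c_start": [7.9, 8.1, 8.1, 7.6],
    "hba1c_end": [7.5, 7.7, 7.7, None],
})

def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows and rows missing the end measurement."""
    out = df.drop_duplicates()
    out = out.dropna(subset=["hba1c_end"])
    return out.reset_index(drop=True)

def add_change(df: pd.DataFrame) -> pd.DataFrame:
    """Add a derived feature: the drop in HbA1c between start and end."""
    out = df.copy()
    out["hba1c_change"] = out["hba1c_start"] - out["hba1c_end"]
    return out

tidy = add_change(clean(raw))
print(tidy)
```

Keeping each step in its own function makes it easy to rerun the pipeline after every new issue found during assessment.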
To address these challenges, I followed a structured approach: assess the data, clean and preprocess it, create new features, and visualize the results. Working through the steps in this order kept the data accurate and the insights grounded in it.
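I assessed the data both manually and programmatically, sorting the issues I found into dirty data (quality problems in the content, such as missing or incorrect values) and messy data (structural problems in how the tables are organized).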
I cleaned and preprocessed the data and created new features to support the analysis.
I used Matplotlib and Plotly to build graphs and visualizations that helped uncover insights.
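The programmatic part of the assessment can be sketched as a small report function. This is an illustrative sketch assuming pandas. The `assess` helper and the toy `patients` table are hypothetical, not the checks actually used in the project.

```python
import pandas as pd

def assess(df: pd.DataFrame) -> dict:
    """Collect simple programmatic quality checks for one table."""
    return {
        "n_rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_per_column": df.isna().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
    }

# Hypothetical toy table: one duplicated row and one missing weight.
patients = pd.DataFrame({
    "patient_id": [1, 2, 2, 3],
    "weight": [70.5, 64.0, 64.0, None],
})

report = assess(patients)
print(report)
```

Duplicate rows and missing values point to dirty data. Structural problems, such as one column holding two variables, usually still need a manual look at the table.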
This data integration process consolidates various CSV files, streamlining analysis and visualization. By merging data from multiple CSV sources, I provided a unified view of the dataset, enhancing the client's ability to derive insights and make informed decisions. The data sources utilized for this
Import data from four CSV files: patients, treatments, treatment_cut, and adverse_reaction.
Provide a summary of what the data is about.
Describe the columns of each table.
Assess the data manually and programmatically to identify issues.
Resolve issues and prepare data for analysis.
Create different charts and graphs to gain insights.
Document insights and conclusions from the analysis.
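The import and consolidation steps above could look roughly like the sketch below, assuming pandas. The in-memory strings stand in for the CSV files, and real code would pass file paths to `pd.read_csv` instead. The `patient_id` join key, the column names, and the idea that the treatment_cut table shares the layout of the treatments table are all assumptions made for illustration.

```python
import io
import pandas as pd

# In-memory stand-ins for three of the CSV files (hypothetical contents).
patients_csv = io.StringIO("patient_id,name\n1,Ann\n2,Bob\n")
treatments_csv = io.StringIO("patient_id,drug\n1,auralin\n2,novodra\n")
treatment_cut_csv = io.StringIO("patient_id,drug\n3,auralin\n")

patients = pd.read_csv(patients_csv)

# Assumption: treatment_cut holds extra rows in the same layout as
# treatments, so the two tables can be stacked into one.
treatments = pd.concat(
    [pd.read_csv(treatments_csv), pd.read_csv(treatment_cut_csv)],
    ignore_index=True,
)

# A left join keeps every treatment row, even without a matching patient.
merged = treatments.merge(patients, on="patient_id", how="left")
print(merged)
```

A left join makes unmatched records visible as missing values instead of silently dropping them, which is useful input for the assessment step.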