Ben Day

...get to know your data, not all data is the same...

Explore

Understanding the data and how it's structured helps me establish context and identify which questions I can answer.

Quality Data Sources

Ensuring the integrity of a dataset gives me a strong foundation and confidence in the results of my analysis. When working with a new dataset, I check the reliability of its source, its completeness, and its relevance to the project.
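
A completeness check of this kind could be sketched in Python as follows. This is a minimal illustration, not the method used on any particular project: the function name `missing_share`, the column names, and the sample rows are all hypothetical.

```python
from collections import Counter


def missing_share(rows, columns):
    """Return the fraction of rows in which each column is empty or absent."""
    if not rows:
        return {col: 0.0 for col in columns}
    missing = Counter()
    for row in rows:
        for col in columns:
            value = row.get(col)
            # Treat None and blank strings as missing values.
            if value is None or str(value).strip() == "":
                missing[col] += 1
    return {col: missing[col] / len(rows) for col in columns}


# Hypothetical sample rows, invented for illustration only.
sample = [
    {"ride_id": "A1", "start_station": "Clark St"},
    {"ride_id": "A2", "start_station": ""},
    {"ride_id": "A3", "start_station": None},
    {"ride_id": "A4", "start_station": "Lake Shore"},
]

report = missing_share(sample, ["ride_id", "start_station"])
print(report)
```

A column with a high share of missing values is a prompt to go back to the source before analysis, rather than a reason to discard the dataset outright.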

Process

Once I am familiar with a dataset, I prepare it for analysis. This includes any cleaning and transformation that may be required.

Data Cleaning

I clean the data to make sure it is complete and correct; otherwise, my analysis may lead to inaccurate insights. The process I use varies depending on the source.

Generally, I check for and complete the following activities:

I verify every change I make to confirm it was applied correctly.
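
As a rough sketch of a clean-then-verify pass, the Python below trims whitespace, drops rows that are missing required fields, removes duplicate keys, and then asserts that the result is what was intended. The three steps are assumed examples of common cleaning activities, not a record of the steps used on any project, and `clean_rows`, the column names, and the sample rows are hypothetical.

```python
def clean_rows(rows, key, required):
    """Trim whitespace, drop rows missing required fields, remove duplicate keys."""
    seen = set()
    cleaned = []
    for row in rows:
        # Strip stray whitespace from every string value.
        tidy = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        # Skip rows where a required field is empty or absent.
        if any(tidy.get(col) in (None, "") for col in required):
            continue
        # Keep only the first row seen for each key value.
        if tidy.get(key) in seen:
            continue
        seen.add(tidy.get(key))
        cleaned.append(tidy)
    return cleaned


# Hypothetical sample rows, invented for illustration only.
raw = [
    {"ride_id": "A1", "rider_type": " member "},
    {"ride_id": "A1", "rider_type": "member"},
    {"ride_id": "A2", "rider_type": ""},
    {"ride_id": "A3", "rider_type": "casual"},
]

cleaned = clean_rows(raw, key="ride_id", required=["ride_id", "rider_type"])

# Verify the changes: keys are unique and no required field is empty.
ids = [row["ride_id"] for row in cleaned]
assert len(ids) == len(set(ids))
assert all(row["rider_type"] for row in cleaned)

print(f"{len(raw)} rows in, {len(cleaned)} rows out")
```

Comparing row counts before and after cleaning is a simple way to confirm that each step removed only what it was meant to remove.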

Transformation

Transforming data helps me organise it and makes it easier to use.

During this process, I:

I perform other transformations during analysis as required by the project. I adhere to Matthew Roche's general rule that "data should be transformed as far upstream as possible, and as far downstream as necessary". In other words, if I'm making a transformation, I make it as close to the source as practically possible. This helps to:

Tools

The tools I use to inspect and process data vary depending on the source and requirements of the project.

Below is an example where I used R for my '2021 Cyclistic Riders' project. 

Accountability

Processing data can result in a significantly different dataset. I create transparency by reporting my results.

Document

Documenting the steps I use to process data ensures accountability: it shows that the steps were executed appropriately and that the resulting data is reliable. The documentation also serves as a future reference for similar data processing requirements.

Change Log

I use Change Logs to list notable modifications made to the data, with an explanation of why each was made. This helps me communicate the changes to stakeholders and colleagues, which creates transparency and accountability.
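
For illustration only, a Change Log entry can be modelled as a small record holding what changed, why, and how many rows were affected. The sketch below assumes a hypothetical `ChangeLogEntry` structure and invented entries; it is not the format of the Change Log referenced on this page.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChangeLogEntry:
    changed_on: date    # when the change was made
    change: str         # what was modified
    reason: str         # why it was modified
    rows_affected: int  # how many rows the change touched


def format_log(entries):
    """Render entries as one line each, oldest first."""
    lines = []
    for entry in sorted(entries, key=lambda e: e.changed_on):
        lines.append(
            f"{entry.changed_on.isoformat()} | {entry.change} | "
            f"{entry.reason} | rows affected: {entry.rows_affected}"
        )
    return "\n".join(lines)


# Hypothetical entries, invented for illustration only.
log = [
    ChangeLogEntry(date(2022, 1, 12), "Removed duplicate rows",
                   "Duplicates would inflate ride counts", 2),
    ChangeLogEntry(date(2022, 1, 10), "Trimmed whitespace in text columns",
                   "Inconsistent labels split one category into several", 15),
]

print(format_log(log))
```

Recording the reason alongside each change is what lets a reader judge whether the modification was justified, not just that it happened.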

Below is an example of a Change Log for my '2021 Cyclistic Riders' project.

2021_Cyclistic_Riders_Change_Log.pdf

Contact

If you would like to discuss job opportunities and how I can help you deliver, you can contact me through LinkedIn.


For a summary of the value and contributions I have made throughout my career, my resume is available on request.