Have you ever find yourself using way too many columns to describe your data? It almost feels like data spanning across 26 columns (column A to Z) were still not enough to describe everything. Worst still, the proliferation of columns could go beyond double alphabets (52 columns and counting), making your spreadsheet very hard to view even with a 30 inch monitor.
The downside of using way too many columns was not only bad for viewing your data. You might already accustomed to freezing and hiding columns to “see” your data better, but the result still beyond satisfactory. If columns were not used in a justifiable manner, it is also bad for manipulation (slicing and dicing) and analysis as well. This definitely affects your productivity and the quality of your research output.
The following explain how you could stop overusing columns, what are the benefits and when columns could be considered too many.
☢️ Stoping Excel Column Meltdown
Very often, we use multiple columns for a “spread out” view in keeping our data (In Excel term it means the dataset has been “pivoted”, more on that in another blog). For example, naming columns by month (e.g. January, February, March, etc.), shades of colours (e.g. red, green, blue etc.) and locations (e.g. District 11, District 12, District 13).
If you look at these columns closely, you will find that they are categorical, meaning that you can group them under a name to describe it all. For example, you could group January, February, March by Month, shades of colours by Colour (obviously) and locations by, well, Location (surprise, surprise!). With this simple step, the need of 12 columns to describe months will become 1, all shades of colour can be put under 1 column and so as various locations under 1 single column.
The benefit of consolidating your data are twofold:
- This save yourself from scrolling endlessly across to find your data and freeing up precious column estates for other data. This makes the dataset much “condense” and “fuss free” for manipulation (slice and dice) and analysis.
- Functions can be written much easier for all attribute under the same category. For example, you would use UNIQUE to find all shades of colours under the same column. Gone the days you need to savvily freeze and hide columns to scroll across spreadsheet for comparison.
🤔 How Many Columns Are Too Many?
The use of excessive amount columns, say going beyond double alphabets (i.e. beginning with AA), could be a good indicator of using way too many columns. You should take a second look at the columns again for possible consolidation.
Wickham, H. (2014) “Tidy Data”. Journal of Statistical Software, 59(10), 1-23. URL: https://www.researchgate.net/publication/215990669_Tidy_data