Blog

Nutrition Data Storage Hypothetical

How much data would it take to store the nutrition label of every food in the United States? Well, let’s find out. I’ll be using C, since it’s the lowest-level language many programmers are comfortable with, and it is more than efficient enough to handle this task.

The data required for one label can be expressed in a C struct with 42 data items, covering everything from vitamins to brand name to serving size. These are all the data points found on Nutrition Facts labels that cannot reasonably be calculated from other values. A struct is a type that groups related variables together, so every label is stored with the same layout.

I calculate 224 bytes per nutrition label, using the C sizeof operator to account for any compiler-dependent padding:

  • 50 letters each for food and brand names (100 bytes)
  • 10 letters each for serving size and total units (“container” being the longest) (20 bytes)
  • 12 32-bit floating point numbers for decimal-significant numbers (48 bytes)
  • 2 32-bit floating point numbers for servings, to convert to fractions on the label (8 bytes)
  • 24 unsigned 16-bit integers (unsigned shorts) for numbers between 0-65,535 (48 bytes)

The original code can be found here: https://fwylupek.com/code/nutrition_data.c
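
As a rough cross-check that does not need a C compiler, the layout can be mirrored with Python's ctypes module. This is only a sketch: the field names are hypothetical (they are not taken from the original file), and it assumes the floats split into 12 general values plus the 2 serving values, which is the split that reproduces the 42 items and 224 bytes quoted above.

```python
import ctypes


class NutritionLabel(ctypes.Structure):
    """Hypothetical mirror of the C struct; field names are illustrative."""

    _fields_ = [
        ("food_name", ctypes.c_char * 50),
        ("brand_name", ctypes.c_char * 50),
        ("serving_size_unit", ctypes.c_char * 10),
        ("total_unit", ctypes.c_char * 10),
        ("decimal_values", ctypes.c_float * 12),  # assumed count, see lead-in
        ("servings", ctypes.c_float * 2),
        ("whole_values", ctypes.c_uint16 * 24),
    ]


def count_items(struct_type):
    """Count stored values, treating each char array as one string item."""
    total = 0
    for _name, field_type in struct_type._fields_:
        is_array = issubclass(field_type, ctypes.Array)
        if is_array and field_type._type_ is not ctypes.c_char:
            total += field_type._length_  # numeric array: one item per element
        else:
            total += 1  # a string (or scalar) counts once
    return total


print(count_items(NutritionLabel))    # 42
print(ctypes.sizeof(NutritionLabel))  # 224 on typical platforms
```

Because every field here is aligned to 4 bytes or less and the fields happen to pack evenly, no padding is inserted in this particular layout; reordering the fields could change that.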

You might be wondering where the calories have gone. They are not stored: calories will be calculated from the macronutrients fat, protein, and carbohydrates. Protein and carbs each account for 4 calories per gram, while fat accounts for 9 calories per gram. Percentages of daily value will also be calculated rather than stored, especially since the reference values are prone to change over time.
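
The calculation described above might be sketched like this. The function and parameter names are made up for illustration, and the daily reference amount is left to the caller, since the text does not list any.

```python
# Calories per gram of each macronutrient, as quoted in the text.
CALORIES_PER_GRAM = {"fat": 9, "protein": 4, "carbs": 4}


def calories(fat_g, protein_g, carbs_g):
    """Estimate calories from grams of fat, protein, and carbohydrates."""
    return (
        fat_g * CALORIES_PER_GRAM["fat"]
        + protein_g * CALORIES_PER_GRAM["protein"]
        + carbs_g * CALORIES_PER_GRAM["carbs"]
    )


def percent_daily_value(amount, daily_value):
    """Percentage of a caller-supplied daily reference amount, as a whole number."""
    return round(100 * amount / daily_value)


print(calories(fat_g=8, protein_g=3, carbs_g=12))  # 132
```

Keeping the reference amounts outside the stored label means a change to the daily values only touches the lookup, not the 347,507 records.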

So then, how many labels are needed? Open Food Facts maintains a collaborative database of 347,507 foods in the United States alone, at the time of writing. This amounts to roughly 17 years’ worth of food, seeing as the USDA records about 20,000 new foods per year.

347,507 labels at 224 bytes each equals 77,841,568 bytes, or 74.24 megabytes of data, counting 1,048,576 bytes per megabyte! Is that more or less than you expected?
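
The arithmetic is easy to reproduce. A short sketch, using only the figures quoted in the text:

```python
LABEL_COUNT = 347_507           # Open Food Facts US entries quoted in the text
BYTES_PER_LABEL = 224           # struct size quoted in the text
BYTES_PER_MEGABYTE = 1024 ** 2  # binary megabyte (1,048,576 bytes)

total_bytes = LABEL_COUNT * BYTES_PER_LABEL
total_megabytes = total_bytes / BYTES_PER_MEGABYTE

print(f"{total_bytes:,} bytes")     # 77,841,568 bytes
print(f"{total_megabytes:.2f} MB")  # 74.24 MB
```

With decimal megabytes (1,000,000 bytes each), the same total comes to about 77.84 MB.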

Greenshot

A free, feature-rich open source screenshot application for Windows.

Ah, today is a good day to write about open source software. Let’s look at an increasingly normal workflow for people:

  1. click on Snipping Tool
  2. click “New”
  3. squint closely at the screen to select the desired pixels
  4. repeat steps 2 and 3 until the selection is correct
  5. click “File” > “Save As…”
  6. finally, choose a destination folder and manually rename the file from “Capture.PNG”
  7. repeat starting at step 2 for each selection

Snipping Tool is abandonware with a dated, cumbersome, limited interface. It supports only four file types: PNG, GIF, JPG, and the oft-ignored, hardly supported MHT, with no options for compression. Snip & Sketch at first seemed like a drop-in replacement with shortcut keys, but its lackluster, mobile-quality features and next to no options made it a non-starter for me, and a downgrade in some respects.

Well, let’s put those troubles behind us, because today I want to write to you about Greenshot: a free and open source screenshot tool that supports region, window, and fullscreen captures. Greenshot has too many options to list here, but it solves every problem I outlined with the Microsoft alternatives, and goes so much further.

Here’s a quick rundown of my favorite features, which are always just one rebindable hotkey away:

  • a magnifying cursor for precise selection
  • output options, including filename formatting, and a compression slider
  • external commands
  • integration for services such as Flickr, Imgur, and MS Office
  • a handy context menu for quick preferences
  • an editor with shapes, text, copy, paste, and more!

Snipe those borders perfectly with this brilliant magnifier.

Please, allow me to take a moment to appreciate using a single file for settings. Greenshot uses a single greenshot.ini file, meaning migrating or backing up my preferences is as easy as copying one file.
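
That one-file backup could be scripted in a few lines. This is a minimal sketch with a hypothetical helper name; the location of greenshot.ini varies by install, so the path is passed in rather than hard-coded.

```python
import shutil
from pathlib import Path


def backup_settings(settings_file, backup_dir):
    """Copy one settings file into backup_dir and return the new path."""
    settings_file = Path(settings_file)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the file's modification time along with its contents.
    return Path(shutil.copy2(settings_file, backup_dir / settings_file.name))
```

On an installed copy the file is commonly found under the user's %APPDATA%\Greenshot folder, but treat that as an assumption and check your own machine before pointing the script at it.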

I highly recommend checking out the official site, Greenshot.org, or its git repo. I give this application 10 “quality desktop tools made for this decade” … out of 10.

Wen

How and why I created Wen: Chinese Character practice.

The view from the train window.

On a train ride from Hangzhou to Guilin, China, I began writing a program to help learn Chinese characters. The idea is simple: present a character on the screen, with or without Pinyin, and the user will input the translation. The program presents a score at the end.

This original Python iteration uses dictionary files with comma-separated components: the Chinese character, the Pinyin with tone numbers, and one or more definitions separated by forward slashes (“/”). An example would be:

零, li2ng, 0/zero

The program then splits up each section, replaces the tone numbers with diacritics (e.g. “líng”), and compares the user’s input to each of the definitions. You can look at the code for this original command-line version here.
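
The steps above could be sketched as follows. This is not the original Wen code: the function names are invented for illustration, and it assumes each tone number sits directly after the vowel it marks, as in the “li2ng” example.

```python
import re
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}


def add_diacritics(pinyin):
    """Turn each vowel+digit pair (e.g. 'i2') into a marked vowel ('í')."""
    def mark(match):
        vowel, tone = match.group(1), match.group(2)
        # NFC merges the vowel and the combining mark into one character.
        return unicodedata.normalize("NFC", vowel + TONE_MARKS[tone])
    return re.sub("([aeiou\u00fc])([1-4])", mark, pinyin)


def parse_entry(line):
    """Split 'character, pinyin, def/def' into its three components."""
    character, pinyin, definitions = (p.strip() for p in line.split(",", 2))
    return character, add_diacritics(pinyin), definitions.split("/")


def is_correct(answer, definitions):
    """Accept the answer if it matches any definition, ignoring case and spacing."""
    cleaned = answer.strip().lower()
    return any(cleaned == d.strip().lower() for d in definitions)


print(parse_entry("零, li2ng, 0/zero"))  # ('零', 'líng', ['0', 'zero'])
```

Splitting on at most two commas keeps any commas inside a definition intact, which seemed the safer choice for free-text glosses.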

However, this is the year 2021, and users, including myself, expect an application like this to have a graphical interface, perhaps even over the web. I would venture to say that some among us would even launch the program on their phones. Ask no more: the online, user-friendly version can be found here, and a live version is hosted here.

The aptly-named Wen Online began as a curious venture into the uncharted world of JavaScript, and it is the culmination of my week-long journey of learning the language. Wen Online features state-of-the-art drop-down menus and bleeding-edge submit and review buttons.

Rendered in the beautiful default browser font, this program is lightning-fast and just sips data at 2-3 kilobytes per chapter (excluding the cumulative exam of more than 500 words, which clocks in at a whopping 57 kilobytes).

All jokes aside, I have worked on larger projects, and for longer, but Wen is the one I am most proud of. Chinese is a notoriously hard language to learn, and after studying for more than 5 years, I still have light-years to go. I personally use Wen multiple times per week, and have found it to work better for rote memorization than anything else, including flashcards. I think this is because of the need for input: typing each answer further secures it in my memory.